AWS DynamoDB


DynamoDB is a fully managed, highly available database with replication across multiple AZs

 

NoSQL – not a relational database! – just simple key-value data

 

single-digit millisecond performance.

 

 

scales to massive workloads

 

100s of TBs of storage

 

fast and consistent, low-latency retrieval

 

integrated with IAM for security, authorization and administration
enables event-driven programming via DynamoDB Streams

 

low cost and auto-scaling

 

important for exam:

no DB provisioning is needed – you just create a table

 

 

Has two table classes: Standard and Infrequent Access (IA).

 

 

Basics of DynamoDB

 

 

very important – also an exam question:
DynamoDB is made up of tables – there is NO SUCH THING AS A DATABASE to create – you only need to create tables!
the database service already exists!

 

 

you just create tables – 

 

each table has a primary key – this must be set at creation time
each table can have an infinite number of items, i.e. rows

each item has attributes, which can be added over time or can be null

adding or changing attributes at any time is much easier than with a conventional relational DB.

 

max size of an item is 400 KB, so it is not good for large objects

 

data types supported are:

 

scalar types: string, number, binary, boolean, null
document types: list, map
set types: string set, number set, binary set

 

tables:

 

have a primary key – a partition key (e.g. user_id), optionally plus a sort key (e.g. game_id)
and
attributes: e.g. score, result – these are the item fields which are NOT part of the primary key

 

 

DynamoDB is a great choice when you need to rapidly evolve the schema – better than the other database options for this.
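To make this concrete, here is a minimal sketch using boto3 (Python). The GameScores table, the user_id/game_id keys and the score attribute are illustrative assumptions, not part of the notes above.

import boto3

# Hypothetical GameScores table with a composite primary key:
# partition key user_id plus sort key game_id.
dynamodb = boto3.client("dynamodb")

dynamodb.create_table(
    TableName="GameScores",
    AttributeDefinitions=[
        {"AttributeName": "user_id", "AttributeType": "S"},
        {"AttributeName": "game_id", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "user_id", "KeyType": "HASH"},   # partition key
        {"AttributeName": "game_id", "KeyType": "RANGE"},  # sort key
    ],
    # provisioned capacity – capacity modes are covered below
    ProvisionedThroughput={"ReadCapacityUnits": 10, "WriteCapacityUnits": 5},
)
dynamodb.get_waiter("table_exists").wait(TableName="GameScores")

# Items are just attribute maps; non-key attributes like score can be added freely.
boto3.resource("dynamodb").Table("GameScores").put_item(
    Item={"user_id": "u123", "game_id": "g456", "score": 98}
)

Note there is no CREATE DATABASE step anywhere – you go straight to creating a table.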

 

read/write capacity modes

 

control how you manage your table's capacity – i.e. the read and write throughput

 

2 Capacity Modes for DynamoDB

 

important for exam:

 

Provisioned mode – the default; capacity is provisioned in advance

 

we have 2 possible capacity modes:

 

provisioned mode, where you specify the number of reads and writes per second; this is the default.

 

very good for predictable workloads.

 

You have to plan the capacity in advance; you pay for provisioned read capacity units (RCUs) and write capacity units (WCUs).

 

these “capacity units” set the desired throughput for your table – configure them in the DynamoDB console when you are provisioning.

 

you can also enable auto scaling for both RCUs and WCUs (see the sketch below)
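Following on from the create_table sketch above, a hedged illustration of adding auto scaling to the provisioned RCUs with the Application Auto Scaling API (numbers and names are assumptions; the same can be repeated for WCUs):

import boto3

autoscaling = boto3.client("application-autoscaling")

# Register the table's read capacity as a scalable target (illustrative bounds).
autoscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/GameScores",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    MinCapacity=10,
    MaxCapacity=100,
)

# Attach a target-tracking policy aiming at ~70% read capacity utilisation.
autoscaling.put_scaling_policy(
    PolicyName="GameScoresReadScaling",
    ServiceNamespace="dynamodb",
    ResourceId="table/GameScores",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
    },
)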

 

and

 

on-demand mode – much more expensive, but better for unpredictable workloads or sudden transaction spikes where provisioned mode can't scale quickly enough

 

on-demand mode:

reads and writes automatically scale up and down according to the workload, so no capacity planning is needed
you pay for what you use; it is more expensive – roughly 2-3x the cost of provisioned

 

good for unpredictable workloads that vary between large and small
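For illustration (assuming the same hypothetical GameScores table as above), switching an existing table to on-demand is a single call – no RCU/WCU planning:

import boto3

# Switch to on-demand (pay-per-request) billing; DynamoDB then scales
# reads and writes automatically with the workload.
boto3.client("dynamodb").update_table(
    TableName="GameScores",
    BillingMode="PAY_PER_REQUEST",
)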

 

need to know for exam!

 

always remember: with DynamoDB you just create tables, never a database!

 

you can specify read and write capacity autoscaling separately

 

 

DynamoDB Accelerator (DAX)

 

 

a fully managed, highly available, seamless in-memory cache for DynamoDB

 

helps solve read congestion by means of memory caching

 

microsecond latency for cached data

 

does not require any application logic modification – it is fully compatible with the existing DynamoDB API

 

applications can access the database via DAX for much faster reads
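As a hedged sketch of that API compatibility, assuming the amazon-dax-client Python package and a placeholder cluster endpoint: the application code looks like ordinary DynamoDB resource calls, only the endpoint changes.

from amazondax import AmazonDaxClient

# Same programming model as boto3's DynamoDB resource, but requests go
# through the DAX cluster (endpoint below is a placeholder).
dax = AmazonDaxClient.resource(
    endpoint_url="daxs://my-cluster.xxxxxx.dax-clusters.eu-west-1.amazonaws.com"
)

table = dax.Table("GameScores")
# Served from the in-memory cache when possible – microsecond latency.
response = table.get_item(Key={"user_id": "u123", "game_id": "g456"})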

 

the TTL for cached data in DAX is 5 minutes by default

 

DAX is different from ElastiCache in that it is designed for DynamoDB and does not change the API
good for caching individual objects

 

ElastiCache
good for storing aggregation results where further processing is required

 

 

DynamoDB Streams

 

very easy – an ordered stream of item-level changes, recording items as they are created, updated or deleted

 

can then be sent on to Kinesis Data Streams
can be read by Lambda or Kinesis Client Library apps

 

data is retained for 24 hrs

 

 

use cases:

 

reacting to changes in real time, e.g. welcome emails to new users (see the Lambda sketch after this list)
to do analytics
to insert into derivative tables or into OpenSearch
or to implement cross-region replication
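A minimal sketch of the Lambda-consumer pattern for the first use case (handler and attribute names are illustrative, not from these notes):

# Illustrative Lambda handler attached to a DynamoDB Stream,
# e.g. to welcome newly created users.
def handler(event, context):
    for record in event["Records"]:
        if record["eventName"] == "INSERT":
            new_image = record["dynamodb"]["NewImage"]  # item in DynamoDB JSON format
            email = new_image.get("email", {}).get("S")
            if email:
                # placeholder for the real action, e.g. an SES send_email call
                print(f"Would send welcome email to {email}")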

 

 

Summary of DynamoDB Streams

 

So, to summarize DynamoDB Streams

 

Application -> create/update/delete actions on the TABLE -> DynamoDB Streams

 

and from the TABLE -> Kinesis Data Streams -> Kinesis Data Firehose

 

Kinesis Data Firehose

-> for analytics -> Redshift
-> for archiving -> S3
-> for indexing -> OpenSearch

 

 

 

DynamoDB Global Tables

 

 

are cross-region

 

two way or multi-way replication

 

to provide for low latency across regions

 

it is
active-active replication

 

this means apps can read and write to the table from any region

 

must enable DynamoDB Streams for this
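As an illustrative sketch (table name and regions are assumptions), adding a replica region with the current global tables version is an update_table call on a table that already has Streams enabled:

import boto3

# Turn the table into a global table by adding a replica in another region.
dynamodb = boto3.client("dynamodb", region_name="eu-west-1")

dynamodb.update_table(
    TableName="GameScores",
    ReplicaUpdates=[{"Create": {"RegionName": "us-east-1"}}],
)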

 

TTL expiry: automatically deletes items after an expiry timestamp

 

e.g. expire each row in the table after 1 month
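A short sketch of enabling TTL (the expires_at attribute name is an assumption); the attribute must hold a future epoch timestamp on each item:

import time
import boto3

dynamodb = boto3.client("dynamodb")

# Tell DynamoDB which numeric attribute holds the expiry timestamp.
dynamodb.update_time_to_live(
    TableName="GameScores",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# Each item then carries its own expiry, e.g. roughly one month from now.
boto3.resource("dynamodb").Table("GameScores").put_item(
    Item={
        "user_id": "u123",
        "game_id": "g456",
        "expires_at": int(time.time()) + 30 * 24 * 3600,
    }
)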

 

 

DynamoDB Indexes – only need to know at high level for exam

 

basically, these allow you to query on attributes other than the primary key

 

2 types:

 

GSI – Global Secondary Index, and LSI – Local Secondary Index

 

(all you need to know for now)

 

by default you query on the primary key, but you can use a GSI or LSI to query on other attributes.
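For illustration only (the index and attribute names are assumptions), querying a GSI simply means adding IndexName to the query:

import boto3
from boto3.dynamodb.conditions import Key

table = boto3.resource("dynamodb").Table("GameScores")

# Query a hypothetical GSI whose partition key is game_id instead of user_id.
response = table.query(
    IndexName="game_id-index",
    KeyConditionExpression=Key("game_id").eq("g456"),
)
items = response["Items"]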

 

 

Transactions in DynamoDB

 

 

these allow you to write to two (or more) tables at the same time:

 

the transaction MUST write to both tables or to neither – in order to keep the tables accurate – so data consistency is maintained
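A minimal sketch of this all-or-nothing behaviour using boto3 (table and attribute names are illustrative):

import boto3

dynamodb = boto3.client("dynamodb")

# Both Puts are applied together, or neither is (low-level DynamoDB JSON format).
dynamodb.transact_write_items(
    TransactItems=[
        {"Put": {"TableName": "Orders",
                 "Item": {"order_id": {"S": "o-1"}, "amount": {"N": "25"}}}},
        {"Put": {"TableName": "AccountBalances",
                 "Item": {"account_id": {"S": "a-1"}, "balance": {"N": "75"}}}},
    ]
)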

 

 

 

 

 

 

 

 
