Learning DynamoDB

Foreword From 2021

A while ago, I listened to an amazing podcast episode about DynamoDB decided to do one of AWS’s DynamoDB hands-on tutorials and create a little presentation about what I learned to share at a meetup.

Enjoy!

Objectives

Get to know each other
Compare and contrast DynamoDB data modeling to relational data modeling
Describe the importance of partitioning in DynamoDB

Non-Work Facts About Me!

Chuck

My first job was at a doggy daycare
I randomly know a lot about aphids
I graduated with high honors in pure math and produced a thesis as an undergraduate
I love building PCs to play big beautiful video games
I am a podcast nerd:
- Freakonomics, Software Engineering Daily, Intelligence Squared, Not Another DnD Podcast, Radiolab, and more
My wife Caroline, my son Rafael, and my son Arlo are my best friends and make me laugh every day

Notice and Wonder

What do you notice?

What do you wonder?

Notice and Wonder Debrief

NoSQL vs. Relational Data Modeling — Online Gaming

Think of an online multiplayer game.

What are some "entities" involved?

What are some access patterns?

The Relational Way

Suppose we want to query for all the usernames of users in a particular game.

Star Schema	Pros	Cons
Table for "user" Table for "game" Fact table that relates "user_ID" with "game_ID" Join "user" and "game" tables and filter for records with the given "game_ID"	Great for ad-hoc queries Great for analytics, like aggregations	Joins don’t scale well

Star Schema

Pros

Cons

Table for "user"
Table for "game"
Fact table that relates "user_ID" with "game_ID"
Join "user" and "game" tables and filter for records with the given "game_ID"

Great for ad-hoc queries
Great for analytics, like aggregations

Joins don’t scale well

The DynamoDB Way

This is all one big, hyper-scalable table! It looks like two entity tables and a join table, but really it’s one big table where the entities and join are created by clever design of the primary and sort keys, which we’ll see in the next slide.

Note	One table with O(1) lookup by primary key, or O(log(n)) if including a sort key. No joins! You plan the access pattern into your data model and include everything in one table.

Design the Primary Key

Table 1. Composite Primary Key
Entity	Partition Key (a.k.a. HASH)	Sort Key (a.k.a. RANGE)
User	`USER#<USERNAME>`	`#PLACEHOLDER#<USERNAME>`
Game	`GAME#<GAME_ID>`	`#PLACEHOLDER#<GAME_ID>`
UserGameMapping	`GAME#<GAME_ID>`	`USER#<USERNAME>`

Code Example

Query multiple entities in one request to one table!

Take a moment to think about what this query does:

game_id = input("what game ID do you want to look up?")
resp = dynamodb.query(
        TableName='battle-royale',
        KeyConditionExpression="PK = :pk AND SK BETWEEN :placeholder AND :users",
        ExpressionAttributeValues={
            ":pk": { "S": "GAME#{}".format(game_id) },
            ":placeholder": { "S": "#PLACEHOLDER#{}".format(game_id) },
            ":users": { "S": "USER$" },
        },
        ScanIndexForward=True
    )

sample response

This query takes a given game ID and looks up all the users in that game. Let’s say the game ID is abc123. The PK part of the KeyConditionExpression stands for "Primary Key", which is the key used to determine which DynamoDB partition holds the data.

Now that we have found the partition where game ID abc123 lives, the next step is to use the sort key, SK, to grab all the users. We look up records in order from :placeholder to :users, which are further specified in the ExpressionAttributeValues parameter. The :placeholder expression corresponds to the string #PLACEHOLDER#abc123 (type denoted by S). The expression :users corresponds to the string USER$.

The # symbol is 35 in ascii, and $ is 36, so USER$ is greater than all USER#<USERNAME> entries. That means this query will return the game followed by all the users in that game in ascending order by username. For example:

{"GAME#abc123" : "#PLACEHOLDER#abc123"}
{"GAME#abc123": "USER#myuser1"}
{"GAME#abc123": "USER#myuser2"}
{"GAME#abc123": "USER#myuser3"}
...

Next Step: Secondary Indexes

Secondary indexes are very useful in DynamoDB. They allow you to query data by attributes other than the primary key. This opens up more access patterns. The benefits of secondary indexes come at the cost ($ literally $) of more reads and writes as DynamoDB automatically updates indexes. It is easy to use secondary indexes poorly, so I encourage you to dig into the tutorial in the reference to learn more about best practices with secondary indexes. Namely:

Sparse secondary index
- Create an index that filters down to a specific subset that won’t grow indefinitely
- example: create index to find all open games on a specific map
Inverted index
- Switch the roles of primary key and sort key
- example: find all games a user has played (opposite of what we did earlier when we found all users in a specific game)

Learning DynamoDB

Foreword From 2021

Objectives

Non-Work Facts About Me!

Notice and Wonder

Notice and Wonder Debrief

NoSQL vs. Relational Data Modeling — Online Gaming

The Relational Way

The DynamoDB Way

Design the Primary Key

Code Example

Next Step: Secondary Indexes

3, 2, 1 Reflection

Objectives

References

Foreword From 2021#

Objectives#

Non-Work Facts About Me!#

Notice and Wonder#

Notice and Wonder Debrief#

NoSQL vs. Relational Data Modeling — Online Gaming#

The Relational Way#

The DynamoDB Way#

Design the Primary Key#

Code Example#

Next Step: Secondary Indexes#

3, 2, 1 Reflection#

Objectives#

References#

Foreword From 2021

Objectives

Non-Work Facts About Me!

Notice and Wonder

Notice and Wonder Debrief

NoSQL vs. Relational Data Modeling — Online Gaming

The Relational Way

The DynamoDB Way

Design the Primary Key

Code Example

Next Step: Secondary Indexes

3, 2, 1 Reflection

Objectives

References