Mastering DynamoDB with Python: A Comprehensive Guide
Introduction:
DynamoDB, a fully managed NoSQL database service provided by Amazon Web Services (AWS), has gained immense popularity for its scalability, flexibility, and high-performance capabilities. In this article, we will delve into the world of DynamoDB and explore how to utilize its power with Python to build robust and scalable applications.
Why DynamoDB?:
DynamoDB is designed to handle large volumes of data and provide seamless performance even under heavy workloads. Its serverless nature eliminates the need for provisioning and managing infrastructure, making it an ideal choice for modern applications that demand scalability and low operational overhead. Whether you're building a high-traffic e-commerce platform, a real-time analytics dashboard, or a mobile app with varying user demands, DynamoDB can cater to your data storage and retrieval needs.
Getting Started:
To start working with DynamoDB from Python, you'll need boto3, the AWS SDK for Python. Install it with pip:
pip install boto3
Connecting to DynamoDB:
Before you can interact with DynamoDB, you need to authenticate with your AWS account. Configure credentials with the AWS CLI (aws configure) or environment variables. Once configured, use boto3 to establish a connection:
import boto3
# Connect to DynamoDB
dynamodb = boto3.resource('dynamodb')
Creating a Table: Let's create a simple example of a DynamoDB table to store user data:
# Create a table
table = dynamodb.create_table(
    TableName='UserTable',
    KeySchema=[
        {'AttributeName': 'user_id', 'KeyType': 'HASH'}
    ],
    AttributeDefinitions=[
        {'AttributeName': 'user_id', 'AttributeType': 'N'}
    ],
    ProvisionedThroughput={
        'ReadCapacityUnits': 5,
        'WriteCapacityUnits': 5
    }
)

# Wait until the table is created
table.meta.client.get_waiter('table_exists').wait(TableName='UserTable')
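Provisioned throughput requires estimating read and write capacity up front. For spiky or unpredictable workloads, DynamoDB also offers on-demand billing. As a sketch (the table and attribute names are carried over from the example above for illustration), the same table can be defined with BillingMode='PAY_PER_REQUEST' instead of a ProvisionedThroughput block:

```python
# Hypothetical on-demand variant of the UserTable definition above.
# With BillingMode='PAY_PER_REQUEST', no ProvisionedThroughput is needed.
create_table_kwargs = {
    'TableName': 'UserTable',
    'KeySchema': [
        {'AttributeName': 'user_id', 'KeyType': 'HASH'}
    ],
    'AttributeDefinitions': [
        {'AttributeName': 'user_id', 'AttributeType': 'N'}
    ],
    'BillingMode': 'PAY_PER_REQUEST',
}

# table = dynamodb.create_table(**create_table_kwargs)
print('ProvisionedThroughput' in create_table_kwargs)  # → False
```

On-demand tables bill per request rather than per provisioned capacity unit, which trades a higher per-request price for zero capacity planning.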
CRUD Operations: DynamoDB supports basic Create, Read, Update, and Delete (CRUD) operations. Let's see how to perform them using Python:
# Insert data
table.put_item(
    Item={
        'user_id': 1,
        'name': 'John Doe',
        'email': 'john@example.com'
    }
)

# Retrieve data
response = table.get_item(Key={'user_id': 1})
user_data = response['Item']

# Update data
table.update_item(
    Key={'user_id': 1},
    UpdateExpression='SET email = :new_email',
    ExpressionAttributeValues={':new_email': 'newemail@example.com'}
)

# Delete data
table.delete_item(Key={'user_id': 1})
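When several attributes change in one update, the UpdateExpression and its placeholder values can be generated from a dict instead of hand-written. A small helper along these lines (hypothetical, not part of boto3) keeps the two in sync:

```python
def build_update(changes):
    """Build an UpdateExpression and matching ExpressionAttributeValues
    from a dict of attribute name -> new value."""
    expr = 'SET ' + ', '.join(f'{k} = :{k}' for k in changes)
    values = {f':{k}': v for k, v in changes.items()}
    return expr, values

expr, values = build_update({'email': 'newemail@example.com', 'name': 'Jane Doe'})
# expr == 'SET email = :email, name = :name'

# table.update_item(Key={'user_id': 1},
#                   UpdateExpression=expr,
#                   ExpressionAttributeValues=values)
```

One caveat: attribute names that collide with DynamoDB reserved words (e.g. "name", "status") must be aliased through ExpressionAttributeNames, which this sketch does not handle.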
Conclusion: DynamoDB offers a powerful and flexible solution for managing data at any scale, and integrating it with Python using boto3 enables developers to build robust and scalable applications. From creating tables to performing CRUD operations, DynamoDB simplifies data management and provides the foundation for building modern, data-driven applications. As you explore the possibilities of DynamoDB with Python, you'll discover how to leverage its features to meet your application's specific needs and ensure optimal performance in a serverless environment.
GetRecord API: Retrieve records from DynamoDB using a key condition (the DynamoDB equivalent of a WHERE clause)
import boto3

# Prefer the default credential chain (environment variables, AWS CLI config)
# over hard-coding keys in source.
client = boto3.client(
    'dynamodb',
    aws_access_key_id="****************",
    aws_secret_access_key='*******************************',
    region_name='us-east-1',
)

# Query a global secondary index for all items with a given template_id.
# templateId is assumed to be supplied by the caller.
response = client.query(
    TableName='dynamoDBTableName',
    IndexName='template_id-index',
    KeyConditionExpression='template_id = :template_id',
    ExpressionAttributeValues={
        ':template_id': {'S': str(templateId)}
    },
    ScanIndexForward=True
)
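A single query call returns at most 1 MB of data; if LastEvaluatedKey is present in the response, more pages remain and must be fetched with ExclusiveStartKey. A hedged sketch of a generic pagination helper (the query function is injected, so the same loop works with client.query and any set of query parameters):

```python
def paginate_query(query_fn, **kwargs):
    """Yield every item across all pages of a DynamoDB query.
    query_fn is expected to behave like boto3's client.query."""
    while True:
        response = query_fn(**kwargs)
        yield from response.get('Items', [])
        last_key = response.get('LastEvaluatedKey')
        if last_key is None:
            break
        # Resume the next page where the previous one left off
        kwargs['ExclusiveStartKey'] = last_key
```

Usage would look like `items = list(paginate_query(client.query, TableName='dynamoDBTableName', ...))`, reusing the same keyword arguments as the single-page query above.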
BatchWrite API: The BatchWriteItem operation in DynamoDB allows you to put or delete multiple items in one request (it does not support updates). boto3's batch_writer helper groups items into manageable chunks and submits them automatically, making it an ideal choice for efficient bulk inserts. The following example puts thousands of records in one run by loading a large .xlsx file from S3 into DynamoDB.
import io
import json
import uuid

import boto3
import pandas as pd

# Prefer the default credential chain over hard-coding keys.
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
dynamoDBTable = dynamodb.Table('dynamoDBTableName')

# Read the workbook from S3
s3OBJ = boto3.client("s3", aws_access_key_id=awsAccessKeyId, aws_secret_access_key=awsSecretAccessKey)
fileObj = s3OBJ.get_object(Bucket=bucketname, Key=filename)
fileContent = fileObj["Body"].read()
readExcelData = io.BytesIO(fileContent)
xls = pd.ExcelFile(readExcelData)

# batch_writer buffers items and flushes them in BatchWriteItem requests
with dynamoDBTable.batch_writer() as writer:
    for sheetIndex, sheet in enumerate(xls.sheet_names, start=1):
        df2 = pd.read_excel(xls, sheet)
        records = json.loads(df2.to_json(orient='records'))
        for i, row in enumerate(records, start=1):
            item = {
                'id': str(event["id"]),    # 'event' is supplied by the caller (e.g. a Lambda event)
                'name': event["name"],
                'surName': event["surName"],
                '_id': str(uuid.uuid4()),
                'sheet': sheet,
                'row': json.dumps(row),
                'row_index': i,
                'sheetIndex': sheetIndex,
            }
            writer.put_item(Item=item)
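batch_writer handles BatchWriteItem's 25-item-per-request limit transparently, buffering puts and flushing them in chunks. A minimal sketch of the chunking it performs under the hood (pure Python, illustrative only):

```python
def chunk(items, size=25):
    """Split items into lists of at most `size` elements,
    mirroring BatchWriteItem's 25-item-per-request limit."""
    return [items[i:i + size] for i in range(0, len(items), size)]

batches = chunk(list(range(60)))
print([len(b) for b in batches])  # → [25, 25, 10]
```

This is why batch inserts are so much faster than calling put_item once per record: 60 items need only 3 requests instead of 60.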
Delete records using batch_writer: Delete all matching records from DynamoDB using Python, paginating through query results
client = boto3.client(
    'dynamodb',
    aws_access_key_id="****************",
    aws_secret_access_key='*******************************',
    region_name='us-east-1',
)
dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
dynamoDBTable = dynamodb.Table('dynamoDBTableName')

def query_page(exclusive_start_key=None):
    # templateId is assumed to be supplied by the caller
    kwargs = dict(
        TableName='lab_tempaltes',
        IndexName='template_id-index',
        KeyConditionExpression='template_id = :template_id',
        ExpressionAttributeValues={':template_id': {'S': str(templateId)}},
        ScanIndexForward=True,
    )
    if exclusive_start_key is not None:
        kwargs['ExclusiveStartKey'] = exclusive_start_key
    return client.query(**kwargs)

# Delete every matching item, page by page
response = query_page()
with dynamoDBTable.batch_writer() as writer:
    while True:
        for item in response["Items"]:
            writer.delete_item(Key={'_id': item["_id"]['S']})
        if 'LastEvaluatedKey' not in response:
            break
        response = query_page(response['LastEvaluatedKey'])
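The pattern above deletes only the items matching one template_id. To truly empty a table, the usual approach is a paginated Scan that projects only the key attribute, feeding each key to batch_writer. A hedged sketch, with the scan and delete calls injected as plain callables so the loop itself is independent of boto3 (`scan_fn` would be `Table.scan`, `delete_fn` the batch writer's `delete_item`):

```python
def delete_all(scan_fn, delete_fn, key_attr='_id'):
    """Scan a table page by page and delete every item by its key.
    scan_fn behaves like Table.scan; delete_fn like batch_writer().delete_item.
    Returns the number of items deleted."""
    kwargs = {'ProjectionExpression': key_attr}  # fetch keys only, not full items
    deleted = 0
    while True:
        page = scan_fn(**kwargs)
        for item in page.get('Items', []):
            delete_fn(Key={key_attr: item[key_attr]})
            deleted += 1
        if 'LastEvaluatedKey' not in page:
            return deleted
        kwargs['ExclusiveStartKey'] = page['LastEvaluatedKey']
```

Worth noting: for very large tables, deleting items one by one consumes write capacity for every item; dropping and recreating the table is often cheaper.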