Manage logs of a Python app on AWS Fargate using Datadog Logs and CloudWatch.

If Dog is a man's best friend, Log is a developer's best friend.

Logging is an important part of any application: it can be very helpful when it comes to debugging unexpected errors or performance issues.

In this article, we are going to set up logging for a Python (Django) application and send the logs to both AWS CloudWatch and Datadog. Datadog Log Management offers useful features such as the integration between APM, metrics and logs, which allows you, for example, to quickly check application logs during a latency spike.

To send logs to both destinations, we are going to use a custom FireLens image for ECS.

Configuring application logs

First of all, we need to configure our application logger. The easiest way to parse logs and avoid problems is to emit logs in JSON format. There are several Python libraries for this purpose; we are going to use python-json-logger.
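If it is not already among your dependencies, the package can be installed from PyPI:

pip install python-json-logger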

We use the following logging config; feel free to tweak it to fit your needs. The dd.trace_id and dd.span_id attributes are used to associate APM traces/spans with log entries.

import os  # needed below for os.environ.get

LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'json': {
            '()': 'pythonjsonlogger.jsonlogger.JsonFormatter',
            'format': '%(asctime)s %(levelname)s [%(name)s] [%(filename)s:%(lineno)d]'
            '[dd.trace_id=%(dd.trace_id)s dd.span_id=%(dd.span_id)s]'
            ' - %(message)s'
        }
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'json'
        }
    },
    'loggers': {
        # root logger
        '': {
            'handlers': ['console'],
            'level': 'WARNING',
            'propagate': False,
        },
        'django': {
            'handlers': ['console'],
            'propagate': False,
        },
        'celery': {
            'handlers': ['console'],
            'level': os.environ.get('LOG_LEVEL', 'INFO'),
            'propagate': False,
        },
        'ddtrace': {
            'handlers': ['console'],
            'level': 'ERROR',
            'propagate': False,
        }
    }
}

This will produce logs like this one.

{"asctime": "2019-12-27 16:15:28,130", "levelname": "ERROR", "name": "graphql.execution.executor", "filename": "executor.py", "lineno": 454, "dd.trace_id": 1316555880157986196, "dd.span_id": 9561506127904666252, "message": "An error occurred while resolving field Query.user", "exc_info": "Traceback (most recent call last):\n  File \"/usr/local/lib/python3.7/site-packages/graphql/execution/executor.py\", line 450, in resolve_or_error\n    return executor.execute(resolve_fn, source, info, **args)\n  File \"/usr/local/lib/python3.7/site-packages/graphql/execution/executors/sync.py\", line 16, in execute\n    return fn(*args, **kwargs)\n  File \"/opt/app/apps/authentication/middleware/auth/gql_middleware.py\", line 23, in resolve\n    authenticated_user = self.middleware.identify_user(validated_token)\n  File \"/opt/app/apps/authentication/middleware/auth/utils.py\", line 52, in identify_user\n    jwt_payload = self._decode_token(token)\n  File \"/opt/app/apps/authentication/middleware/auth/utils.py\", line 45, in _decode_token\n    raise self.RaiseTokenExpiredException(_(str(jwt_payload['reason'])))\ngraphql_core.exceptions.GraphQLTokenExpiredError: Token is expired"}

Set up a custom FireLens Docker image

FireLens streamlines logging by enabling you to configure a log collection and forwarding tool such as Fluent Bit directly in your Fargate tasks. You can easily send logs to either CloudWatch or Datadog with it, but when you want to send them to both destinations, things get a bit more complicated.

To send logs to both destinations, we have to create a custom configuration file. The easiest way to provide it would be to pull it from S3, but that is not supported by Fargate tasks, so we will have to build a custom Docker image with the config file embedded.

The custom file has to have the following format.

[OUTPUT]
    Name cloudwatch
    Match   *
    region $REGION
    log_group_name $LOG_GROUP_NAME
    log_stream_prefix $LOG_PREFIX
    auto_create_group true
    log_key log

[OUTPUT]
    Name datadog
    Match  *
    TLS on
    apikey $DD_API_KEY
    dd_service $DD_SERVICE
    dd_source $DD_SOURCE
    dd_tags $DD_TAGS

As you can see, there are some dynamic variables that need to be configured per task. To configure these parameters dynamically, we are going to pass them as environment variables and use envsubst to replace the values in docker-entrypoint.sh.

This is the Dockerfile we are going to use. We use amazon/aws-for-fluent-bit as the base image, install the aws-cli and gettext packages (gettext provides envsubst), copy the config file above and add a custom entrypoint.

FROM amazon/aws-for-fluent-bit:latest

RUN yum -y update && yum -y install aws-cli gettext

ADD datadog_cloudwatch_placeholders.conf /datadog_cloudwatch_placeholders.conf

COPY docker-entrypoint.sh /
RUN chmod +x /docker-entrypoint.sh

ENTRYPOINT ["/docker-entrypoint.sh"]

In docker-entrypoint.sh we pull a file containing the variables from S3, set them as environment variables and replace them in the config file using envsubst. Finally, we start Fluent Bit with the FireLens-generated config, which includes our custom file.

#!/bin/sh
set -e

if [ -n "$APP_SETTING_S3_FILE" ]
then
    # Pull the settings file from S3 and export its variables
    aws s3 cp "$APP_SETTING_S3_FILE" /tmp/env
    echo "Exporting .env file"
    eval $(cat /tmp/env | sed 's/^/export /')

    # Replace the placeholders in the config with the exported values
    envsubst < /datadog_cloudwatch_placeholders.conf > /extra.conf
fi

# Start Fluent Bit with the config generated by FireLens, which includes /extra.conf
exec /fluent-bit/bin/fluent-bit -e /fluent-bit/firehose.so -e /fluent-bit/cloudwatch.so -e /fluent-bit/kinesis.so -c /fluent-bit/etc/fluent-bit.conf

After you build the image, you'll need to push it to a Docker registry (ECR, Docker Hub...) to be able to use it in your Fargate tasks.
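As an illustration, pushing the image to ECR could look like the following (the account ID, region and repository name are placeholders, and this assumes an AWS CLI version that supports get-login-password):

# Placeholders: 123456789012 (account), eu-west-1 (region), fluent-bit-custom (repository)
aws ecr get-login-password --region eu-west-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-west-1.amazonaws.com

docker build -t fluent-bit-custom .
docker tag fluent-bit-custom:latest 123456789012.dkr.ecr.eu-west-1.amazonaws.com/fluent-bit-custom:latest
docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/fluent-bit-custom:latest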

The format of the S3 file is the following.

REGION=<REGION>
LOG_GROUP_NAME=<LOG_GROUP_NAME>
LOG_PREFIX=<LOG_PREFIX>
DD_API_KEY=<DD_API_KEY>
DD_SERVICE=<DD_SERVICE>
DD_SOURCE=<DD_SOURCE>
DD_TAGS=<DD_TAGS>
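Uploading this file could look like the following (the bucket and key are placeholders; the resulting S3 URL must match the APP_SETTING_S3_FILE variable used below, and the task role needs s3:GetObject on that object):

aws s3 cp fluent-bit.env s3://my-config-bucket/fluent-bit/env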

Configure Fargate tasks to use FireLens log router

Next, we need to configure our Fargate tasks to use the FireLens log driver and set up a new container to act as a log router.

In the ECS task definition, we need to add a new container with these parameters.

  {
    "essential": true,
    "image": "<DOCKER_IMAGE_URL>",
    "name": "log_router",
    "environment": [
      {
        "name": "APP_SETTING_S3_FILE",
        "value": "<S3_SETTINGS_FILE>"
      }
    ],
    "firelensConfiguration": {
      "type": "fluentbit",
      "options": {
        "config-file-type": "file",
        "config-file-value": "/extra.conf",
        "enable-ecs-log-metadata": "true"
      }
    }
  }

Here, we create a new container that will be used as the log router, using the custom image we created before and specifying a custom configuration file for Fluent Bit.

Next, we need to change the log driver to awsfirelens on the containers whose logs we want to send through FireLens.

    "logConfiguration": {
      "logDriver": "awsfirelens"
    }

This is all we should need to start seeing logs on CloudWatch and Datadog. Remember that your ECS task requires the following permissions to create new CloudWatch log groups and write to them.

    "logs:CreateLogStream",
    "logs:CreateLogGroup",
    "logs:DescribeLogStreams",
    "logs:PutLogEvents"

Parse Datadog Logs

Finally, we need to parse the logs on Datadog to be able to filter them by all their attributes and associate them with APM data.

To do that, first of all, we need to go to Logs/Configuration and modify the Reserved Attributes Mapping.

Then, we need to create a custom pipeline. We will create a new Grok Parser to parse the JSON format.

We also need to remap some attributes to the official Datadog log values.
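As a rough sketch of that pipeline, the Grok Parser can parse the whole message as JSON with a single rule (the rule name is arbitrary):

parseJson %{data::json}

After that, the usual processors are a Status Remapper on levelname, a Date Remapper on asctime and a Trace Id Remapper on dd.trace_id, so Datadog uses those attributes as the official log status, timestamp and trace ID.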

And that's all! After that, we will see our logs fully parsed in Datadog.

And if we go to APM traces, we can easily jump to the associated logs!

Thanks for reading!

Jose López
2019-12-30 | 6 min read