Commit 2ed83fd5, authored 4 years ago by nimrod (parent 96940ec0): Post about structured logging.

content/structured-logging.rst (new file): 158 additions, 0 deletions
Human and structured logs
=========================
:date: 2021-05-08
:summary: Output both human readable and structured logs
As anyone who's heard of the 12 factor app knows, logs are best written to the
standard output of the process, letting the environment handle them from there.
A while back I worked at a company where we collected logs with the EFK stack.
We ingested quite a bit of logs daily, and tracking down issues in Kibana using
regular expressions became unwieldy. Structured logging (read: JSON lines,
where every line is a JSON object) was the answer. It would allow us to more
easily find all of the logs for a single transaction or account, and our EFK
stack could parse it out of the box.
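To make "JSON lines" concrete, here's a minimal sketch of what a single
structured log line might look like (the field names here are illustrative,
not the exact ones our stack produced):

.. code:: python

   import json

   entry = {
       'timestamp': '2021-05-08 12:00:00',
       'level': 'error',
       'event': 'Failed to send message.',
       'transaction_id': 'abc123',
       'uid': 42,
   }
   # One JSON object per line; no multi-line records to stitch together.
   line = json.dumps(entry, sort_keys=True)
   print(line)

Every line stands alone, so the collector can parse each one independently and
Kibana can filter on :code:`transaction_id` or :code:`uid` directly, no
regular expressions needed.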
There had been a previous attempt at structured logs before I joined the team
that failed to be merged. It was too invasive, the diff was too big, it never
worked correctly, and it required too much work to carry the diff forward;
eventually the branch was dropped.
Another complaint I heard from the developers about that previous attempt was
that it made reading the logs when developing locally (reading the logs from the
Docker container) or for failing CI builds more difficult. Indeed, it was not
human-readable.
Others at the company (with experience in log4j and enterprise software)
wanted to write all of the logs to two files with different formats. I saw
this as a mistake: it would make handling logs inside of containers more
difficult, and it would require dealing with log rotation.
Another goal for me was to be able to gradually convert components to the new
log handling, so we wouldn't end up with another huge diff that never gets
merged.
I went with the `structlog <http://www.structlog.org/>`_ package and, after
delving into it, created a module that only needs to be imported to take
effect. The module configures the log format (human or JSON) and the log level
according to environment variables, but it doesn't require any change to the
logging calls. The structlog package also comes with a great-looking console
renderer for terminals where control characters are valid. Here's how it
looks:
.. image:: http://structlog.readthedocs.io/en/stable/_images/console_renderer.png
   :alt: Example rendered logs in a console
And here's the module:
.. code:: python

   import logging
   import logging.config
   import os

   import structlog
   import structlog.dev
   import structlog.processors
   import structlog.stdlib

   emptyLogRecord = logging.makeLogRecord({})


   def add_extra_params(logger, log_method, event_dict):
       # Copy fields that were passed via the ``extra`` parameter (and are
       # therefore attributes of the LogRecord) into the event dict.
       record = event_dict['_record']
       for (k, v) in list(record.__dict__.items()):
           if k != 'stack_info' and k not in emptyLogRecord.__dict__ and k not in event_dict:  # noqa: E501
               event_dict[k] = v
       return event_dict


   pre_chain = [
       structlog.processors.StackInfoRenderer(),
       structlog.stdlib.PositionalArgumentsFormatter(),
       structlog.stdlib.add_log_level,
       structlog.stdlib.add_logger_name,
       structlog.processors.TimeStamper(fmt='%Y-%m-%d %H:%M:%S'),
       structlog.processors.format_exc_info,
       add_extra_params,
   ]

   processor = {
       'human': structlog.dev.ConsoleRenderer(),
       'json': structlog.processors.JSONRenderer(),
   }

   logFormat = os.environ.get('LOG_FORMAT', 'human')
   assert logFormat in processor

   try:
       logLevel = getattr(logging, os.environ.get('LOG_LEVEL', 'WARN').upper())
   except AttributeError:
       logLevel = logging.WARN  # The default log level.

   config = {
       'version': 1,
       'disable_existing_loggers': True,
       'formatters': {
           'json': {
               '()': structlog.stdlib.ProcessorFormatter,
               'processor': processor['json'],
               'foreign_pre_chain': pre_chain,
               'format': '%(message)s',
           },
           'human': {
               '()': structlog.stdlib.ProcessorFormatter,
               'processor': processor['human'],
               'foreign_pre_chain': pre_chain,
           },
       },
       'handlers': {
           'default': {
               'formatter': logFormat,
               'class': 'logging.StreamHandler',
           },
       },
       'loggers': {
           '': {
               'handlers': ['default'],
               'level': logLevel,
               'propagate': True,
           },
       },
   }

   logging.config.dictConfig(config)
   structlog.configure(
       processors=[processor[logFormat]],
       context_class=dict,
       logger_factory=structlog.stdlib.LoggerFactory(),
   )
I called this module :code:`logging_config`. If a module imported it, logs
would be human-readable by default, and if the :code:`LOG_FORMAT` environment
variable was set to :code:`json` (as it would be in the staging and production
environments, where we had an EFK stack running and collecting logs), the
output was in JSON format.
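To illustrate the difference the :code:`json` format makes, here's a
self-contained, stdlib-only approximation of a JSON-formatted record (a toy
stand-in, not the real :code:`ProcessorFormatter` plus :code:`JSONRenderer`
pair; the class and field names are mine):

.. code:: python

   import io
   import json
   import logging

   class JsonFormatter(logging.Formatter):
       """Toy stand-in for ProcessorFormatter with the JSON renderer."""
       def format(self, record):
           return json.dumps({
               'level': record.levelname.lower(),
               'logger': record.name,
               'event': record.getMessage(),
           })

   stream = io.StringIO()
   handler = logging.StreamHandler(stream)
   handler.setFormatter(JsonFormatter())
   logger = logging.getLogger('json-demo')
   logger.addHandler(handler)
   logger.propagate = False

   logger.error('Failed to send message.')
   line = stream.getvalue().strip()

Fluentd gets one parseable object per line, while developers running with the
default :code:`LOG_FORMAT=human` keep readable output.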
This was the first step, and it could be merged without breaking existing
components. However, most logging calls still had formatted strings which
didn't expose all of the data in a structured format. We had to go module by
module and change each call to have a single, static log message, passing all
of the variables in the :code:`extra` parameter. It meant changing calls that
looked like this:
.. code:: python

   logger.error(f"Failed to send message {message} to user {uid}.")
to something like this (note that :code:`message` is a reserved attribute on
:code:`LogRecord`, so passing it in :code:`extra` raises a :code:`KeyError`
and the field needs a different name):

.. code:: python

   logger.error("Failed to send message.", extra={"sent_message": message, "uid": uid})
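The reason this works is that :code:`logging` copies every :code:`extra` field
onto the :code:`LogRecord` as an attribute, which is exactly what the
:code:`add_extra_params` processor above reads back. A self-contained sketch
of that mechanism (the handler and logger names are illustrative):

.. code:: python

   import logging

   captured = []

   class Capture(logging.Handler):
       # Collect records instead of formatting them, so we can inspect them.
       def emit(self, record):
           captured.append(record)

   logger = logging.getLogger('extra-demo')
   logger.addHandler(Capture())
   logger.propagate = False

   logger.error('Failed to send message.', extra={'uid': 42})
   record = captured[0]

On the captured record, :code:`record.uid` is :code:`42` and
:code:`record.getMessage()` is the static string, so the renderer can emit
:code:`uid` as its own searchable field instead of burying it in the message
text.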
We could add fields to the output easily, deal with missing parameters more
easily, output the Flask request context, etc. We could do this piecemeal,
going through one module at a time, without having to maintain a growing diff.
It kept the human-readable logs where needed and gave us easily searchable
logs in Kibana where available. We kept to the 12 factor app methodology and
its benefits. It required little change to our setup and deployment. And it
had noticeable benefits out of the gate, which made the Dev team pitch in with
converting modules to the new logging convention.