Python is a valuable programming language, but using it without proper security best practices puts applications at risk of an attack.
Python is a fast, platform-agnostic, and easy-to-learn programming language that is suited for beginners and experienced developers alike. Ever since its first release in 1991, Python has had a constant presence in the computer world and has become a go-to language thanks to its easy-to-understand code and versatility. Today, Python can boast a wide array of libraries and frameworks, and they are the cornerstone of fast and easy Python programming—the so-called Pythonic way of development.
But like all programming languages, Python is not immune to security threats. Secure coding best practices must be adopted to avoid risks from attackers. In this post, we’ll explore Python security best practices that should be employed when building secure applications.
You should always use up-to-date code to make sure that your software will work without issues and won’t open doors for attackers. Python is no exception to this rule. If we compare Python versions 2 and 3, there are major security advancements in the later release that should help keep your software secure. Another advantage of Python is its big community that takes care of reported security flaws quickly. If you have questions about the current state of Python vulnerabilities, this page should provide some answers.
It’s important to mention that Python versions are not fully compatible with each other—there are differences that will not allow you to run code you wrote in Python 2.x with Python 3.x versions. This raises many issues for developers, as it requires them to rewrite a lot of code in order to move to the later version of Python.
However, the improvements to Python 3.x on the security front make updating not only worth it but important. Those changes are
Developers working with older versions of Python (those no longer supported) need to make sure to validate inputs and, where possible, avoid calls to dangerous functions, such as those that execute shell commands or spawn processes. Python 3.x changes language syntax and semantics in a way that is not backward-compatible, so migrating a larger codebase is not easy and will introduce substantial challenges, but it is nevertheless advisable.
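As a minimal sketch of avoiding shell execution, the example below passes user-supplied text to a child process as a separate argument list entry instead of building a shell command string. The function name `run_echo` and the sample input are assumptions made for illustration only.

```python
import subprocess
import sys


def run_echo(user_text: str) -> str:
    """Echo user_text through a child Python process without using a shell.

    `run_echo` is a hypothetical helper for this sketch.
    """
    # The command is a list, so user_text arrives as one argv entry.
    # No shell is started, so characters such as ';' or '&&' are not
    # interpreted as command separators.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_text],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The text is echoed back literally; nothing after ';' is executed.
    print(run_echo("hello; echo injected"))
```

The same input concatenated into a string and run with `shell=True` would be parsed by the shell, which is the pattern to avoid.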
As mentioned, a large community supports Python and Python libraries, and extends its functionalities. However, it can be difficult to ensure that the packages you pull from the Python Package Index (PyPI) are safe for your project. Although PyPI gives package maintainers the option of signing their submissions so that adopters can validate the download’s integrity and the author’s identity, it’s important to keep in mind that packages in the PyPI don’t go through a security review.
Let’s start with license compliance. Projects are published under a variety of licenses, and each has its own license obligations. Regardless of license type, you need to fulfill all license obligations in order to avoid legal issues. You may want to avoid certain licenses entirely to avoid losing your intellectual property. Read our previous blog post for more information on license compliance.
Adhering to Python security best practices means making sure that your code is free of vulnerabilities and bugs, so users and customers can use it without danger. There are two types of code to consider here. One is proprietary code—the code that you wrote. Proprietary code is best checked with a static application security testing (SAST) tool like Coverity®, which finds errors introduced during development that could make your code insecure.
The second is the code you get from others. Open source or third-party code—including the dependencies in your code, both direct and transitive—is best handled by a software composition analysis (SCA) tool such as Black Duck®. This type of tool uncovers information about the packages you’re using, including their licensing, security, and operational risk state. Using both SAST and SCA tools helps uncover errors early in the software development life cycle, rather than later when they are more expensive and time-consuming to address.
One key to interactive software is user inputs and the reaction of the software to them. But however useful, those inputs can be highly dangerous, as they can lead to injection attacks. One of the most common and simple injection attacks is SQL injection.
SQL injection is a vulnerability in which an attacker influences the queries that software makes to the database. By inserting a relatively simple command, one can, for example, turn an authorization check into administrative access to a web portal. Some web platforms limit the special characters that can be included in usernames and passwords to prevent the use of those characters in a SQL injection attack. Unfortunately, this makes the creation of strong passwords more difficult. Therefore, it’s best to sanitize the inputs—that is, check every input and create rules defining valid inputs, acceptable character sequences, and which combinations to allow. This helps prevent injection attacks while still allowing strong passwords.
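A rule defining valid inputs can be sketched as an allow-list check. The pattern below (3 to 32 characters drawn from letters, digits, underscore, dot, and hyphen) is a hypothetical rule for a username field, not a recommendation for every application, and `is_valid_username` is a name invented for this example.

```python
import re

# Hypothetical rule for this sketch: 3-32 characters, limited to
# letters, digits, underscore, dot, and hyphen.
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{3,32}")


def is_valid_username(value: str) -> bool:
    """Return True only if the whole value matches the allow-list rule."""
    # fullmatch requires the entire string to match, not just a prefix.
    return USERNAME_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["alice_01", "ab", "alice' OR '1'='1"]:
        print(candidate, is_valid_username(candidate))
```

A check like this suits identifiers such as usernames. Passwords are usually better left unrestricted and kept out of SQL text altogether through parameterized queries.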
To further improve security, make sure your database supports prepared statements. Databases like MySQL, MS SQL Server, and PostgreSQL support this functionality because it can protect against vulnerabilities like SQL injection, and in some cases it even improves application performance, especially if you run the same SQL statements repeatedly. And in Python, you can use parameterized statements even if your database doesn’t support prepared statements. Python’s database drivers, including the sqlite3 module in the standard library, accept query parameters, and some drivers emulate prepared statements on the client side if needed. The biggest security benefit of using prepared statements is the separation of SQL statements and user-provided data. This ensures that user-provided data cannot be abused to modify SQL statements: it is used literally as a value in the precompiled statement, whose logic does not change.
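The separation of SQL text and user data can be sketched with the standard library's `sqlite3` module, which accepts `?` placeholders. The table layout, the sample row, and the helper names `make_demo_db` and `find_user` are assumptions for illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one hypothetical user row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, password_hash TEXT)"
    )
    conn.execute(
        "INSERT INTO users (name, password_hash) VALUES (?, ?)",
        ("alice", "hash-a"),
    )
    return conn


def find_user(conn: sqlite3.Connection, name: str, password_hash: str):
    """Look up a user; the inputs are bound as values, never spliced into SQL."""
    cursor = conn.execute(
        "SELECT id FROM users WHERE name = ? AND password_hash = ?",
        (name, password_hash),
    )
    return cursor.fetchone()


if __name__ == "__main__":
    db = make_demo_db()
    print(find_user(db, "alice", "hash-a"))
    # The classic injection string is treated as a literal name and matches nothing.
    print(find_user(db, "alice' --", "anything"))
```

Other drivers use different placeholder styles (for example `%s`), so check the documentation of the driver you use.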
Malicious actors often create misspelled domains to catch people who have misspelled their URL. The same can happen when fetching libraries from PyPI. A malicious package with a name similar to a legitimate one could be placed in a repository to trick someone into fetching it by mistake.
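One way to catch a misspelled package name before installing it is to compare it against the names you actually intend to use. The sketch below is only a heuristic built on the standard library's `difflib`. The list of trusted names, the similarity cutoff, and the function name `suspicious_near_miss` are assumptions, and a check like this does not replace reviewing what you install.

```python
import difflib

# Hypothetical allow-list of packages this project intends to depend on.
TRUSTED_PACKAGES = ["requests", "numpy", "django", "urllib3"]


def suspicious_near_miss(name, trusted=TRUSTED_PACKAGES, cutoff=0.8):
    """Return the trusted name that `name` closely resembles, or None.

    An exact match is fine and returns None; a near miss such as a
    swapped pair of letters returns the name it is probably imitating.
    """
    lowered = name.lower()
    if lowered in trusted:
        return None
    matches = difflib.get_close_matches(lowered, trusted, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for requested in ["requests", "reqeusts", "flask"]:
        print(requested, "->", suspicious_near_miss(requested))
```

A name that is unrelated to every trusted package also returns None here, so the result only flags likely typos, not unknown packages in general.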
In absolute imports, you specify the full path to the package you want to use. In relative imports, you import a package relative to the location of the project where you made the import statement. There are two types of relative imports: explicit, where the path relative to the current module is spelled out, and implicit, where no such path is given.
The danger of implicit relative imports is that a poisoned package could find its way into another part of your project (via an import of another library) and then could mistakenly be used instead of the library you intended. Due to the unspecified path and the potential for confusion, the implicit relative import was removed from Python 3.x. However, if you are still using older versions of Python, remove the implicit relative imports and use either the absolute or explicit relative import types.
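The two recommended import styles can be sketched with a throwaway package. The package name `shop`, its modules, and the helper functions are hypothetical. The script writes the package to a temporary directory so the example stays self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PRICING_SRC = "def total(prices):\n    return sum(prices)\n"

CART_SRC = '''\
# Absolute import: the full path from the top-level package.
from shop import pricing as pricing_absolute
# Explicit relative import: the leading dot means "this package".
from . import pricing as pricing_relative


def same_module():
    return pricing_absolute is pricing_relative


def checkout(prices):
    return pricing_relative.total(prices)
'''


def build_demo_package(root: Path) -> None:
    """Write the hypothetical `shop` package under root."""
    package_dir = root / "shop"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "pricing.py").write_text(PRICING_SRC)
    (package_dir / "cart.py").write_text(CART_SRC)


def run_demo():
    """Import shop.cart and show that both styles resolve to one module."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            cart = importlib.import_module("shop.cart")
            return cart.same_module(), cart.checkout([2, 3])
        finally:
            # Undo the changes to the import state made for the demo.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n == "shop" or n.startswith("shop.")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(run_demo())
```

Both statements in `cart.py` name exactly where `pricing` comes from, which is the property the implicit form lacked.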
Using virtual environments is a Python security best practice not only for security reasons, but also to help keep your development environments organized. Imagine an operating system without folder structures. All files (configuration files, all the libraries they are using, text documents, images, music files, and videos) are combined in one folder with only the file name to tell them apart. Not only would it be extremely difficult to find the correct file, you would also have problems with file name duplications. This chaos would be a fertile ground for mistakes and security issues. The same chaos results if you keep all your project files and libraries spread throughout your system without any organization. You don’t know what library is used where, or what project will be affected if you remove something.
When you set up a virtual environment for your project, you ensure that all the packages you need for the project are available and isolated from other projects on your system. This helps avoid collisions and conflicts between libraries. Virtual environments are also great for those using Python 2.7 and looking for a good way to start projects with Python 3.x, as you can set up a virtual environment that will use Python 3.x without impacting the projects using Python 2.7. And virtual environments help keep any malicious packages you might have inadvertently pulled out of your other projects and your system-wide installation. Keep in mind that a virtual environment is not a sandbox: code that runs inside it still has the permissions of your user account.
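Creating an isolated environment can be sketched with the standard library's `venv` module, which is what `python -m venv` uses. The directory name `demo-env` and the helper `create_env` are assumptions, and pip is left out here only to keep the sketch fast and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_env(target: Path) -> Path:
    """Create a virtual environment at target and return its config file path."""
    # with_pip=False skips bootstrapping pip; a real project would
    # usually pass with_pip=True or run `python -m venv <dir>`.
    venv.create(target, with_pip=False)
    return target / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_file = create_env(Path(tmp) / "demo-env")
        # pyvenv.cfg marks the directory as a virtual environment.
        print(config_file.read_text())
```

Packages installed after activating such an environment land in its own `site-packages` directory rather than in the system-wide one.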
Have you ever heard the saying “the internet never forgets”? This is true for images and media, and it’s true for any secrets you might distribute with your code.
Developers sometimes hardcode information such as passwords, URLs with authentication information, and API keys to make testing easier. This is a bad practice, as such hardcoded secrets can be forgotten about and then committed to a code repo like GitHub or similar. Once this happens, those secrets will be included in databases or logs for anyone to see. Make sure that anything you upload—code, readme and configuration files, and especially plain text files—is free of secret information of any kind.
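One common alternative to hardcoding is to read secrets from the process environment at runtime. The sketch below assumes a hypothetical variable name, `DEMO_API_KEY`, and the helper and exception names are invented for this example. Dedicated secret managers are another option that this sketch does not cover.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def get_secret(name: str) -> str:
    """Return the secret stored in the environment variable `name`."""
    value = os.environ.get(name)
    if not value:
        # Fail loudly instead of falling back to a hardcoded default.
        raise MissingSecretError(f"Environment variable {name} is not set")
    return value


if __name__ == "__main__":
    # Set only for the demo; in practice the deployment environment
    # provides the value and it never appears in the repository.
    os.environ["DEMO_API_KEY"] = "example-value"
    print(len(get_secret("DEMO_API_KEY")))
```

Because the value lives outside the source tree, committing the code does not commit the secret.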
We all know the trial and error paradigm in development. You code and test, and based on the outcome, adjust your code and test again. This is a never-ending process that ensures an ongoing supply of good debugging information. That’s why development environments usually display all debugging output.
And that’s why it’s important to separate the development environment from the production environment. Debugging information on your production system is a security hazard. If the environments aren’t separated, every bug will be communicated to public users, so a malicious actor could get information on how to breach your systems. Switch off any debugging information on the production systems that could be publicly visible, and replace those notifications about issues with exception-handling code that places the information on your internal bug-tracking systems. Users should see only a generalized explanation of the error if needed.
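The idea of keeping details internal while showing users a generalized message can be sketched as follows. The logger name, the message text, and the `handle_request` wrapper are assumptions. A real application would route the log records to its own tracking system.

```python
import logging
import uuid

# Hypothetical logger name; configure its handlers to reach your
# internal log or bug-tracking system.
logger = logging.getLogger("app.errors")

GENERIC_MESSAGE = "Something went wrong. Please try again later."


def handle_request(handler, *args):
    """Run handler and hide internal error details from the caller."""
    try:
        return {"ok": True, "result": handler(*args)}
    except Exception:
        incident_id = uuid.uuid4().hex
        # logger.exception records the full traceback internally.
        logger.exception("Unhandled error, incident %s", incident_id)
        # The user sees only a generic message and a reference id.
        return {"ok": False, "message": GENERIC_MESSAGE, "incident": incident_id}


if __name__ == "__main__":
    print(handle_request(lambda x: x + 1, 1))
    print(handle_request(lambda: 1 / 0)["message"])
```

The incident id lets support staff find the matching traceback without exposing it to the public.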
Of course, not displaying debugging information is recommended, but you still need to fix the issues. Take care of all debugging before the system is taken into production. SAST and SCA tools are highly advisable. Coverity SAST tools uncover any development mistakes that lead to vulnerabilities in your proprietary code, and Black Duck SCA checks open source components and their direct or transitive dependencies for any risks they bring into your code. These tests help make sure you get rid of security risks in your code before production.
I hope that this overview of Python security best practices gave you some easy tips for developing with Python. Python can be used in many different areas including machine learning, artificial intelligence, data science, and web development. In web development it acts as a worthy opponent to PHP, and when used together with the Django web framework, it can provide fast results. Whether you’re just getting started with Python or are already using it for your development, make sure you’re familiar with the threats and how to avoid them so you build and maintain secure applications.
Boris Cipot is a senior sales engineer at Synopsys. He helps companies of all shapes and sizes to create secure software. Boris joined Synopsys when Black Duck Software was acquired in 2017. He specializes in open source software security, robotics, and artificial intelligence. He has also worked in the cyber security field since 2003 in anti-malware software at F-Secure and Avira.