Python Scanning

SourceClear Software Composition Analysis

Finding vulnerabilities in your Python repositories using SourceClear is simple. In the following section, you will find steps for running a SourceClear scan on Python repositories using the SourceClear Command Line Interface, but scanning can be performed by any of our CI Integrations as well.

Requirements

Scanning a repository that uses PyPI for package management requires that the project's libraries can be assembled in the environment where you run the scan. Specifically, you need the following:
  • Requirements for the SourceClear agent
  • Code repository using Python 2.x or 3.x
  • Python 2.x or 3.x installed on your path
  • pip installed on your path
  • One of the following in the repository to be scanned: setup.py, requirements.txt, requirements-dev.txt, dev-requirements.txt
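The checklist above can be verified before running a scan. The following is a minimal sketch, not part of the SourceClear agent; the function name `missing_prerequisites` and the fallback to `python3` and `pip3` executable names are assumptions made for illustration.

```python
import shutil
from pathlib import Path

# Manifest file names listed in the requirements above.
MANIFESTS = ("setup.py", "requirements.txt", "requirements-dev.txt", "dev-requirements.txt")


def missing_prerequisites(repo_dir, which=shutil.which):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # Assumption: Python and pip may be exposed under either name.
    if not (which("python") or which("python3")):
        problems.append("python is not on the PATH")
    if not (which("pip") or which("pip3")):
        problems.append("pip is not on the PATH")
    repo = Path(repo_dir)
    if not any((repo / name).is_file() for name in MANIFESTS):
        problems.append("no setup.py or requirements file found in " + str(repo))
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites("."):
        print("missing:", problem)
```

This sketch does not check the requirements of the SourceClear agent itself, which are documented separately.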

Running a scan

You can use SourceClear to scan any code repository that you have access to and that fulfills the above requirements. To demonstrate how to run a scan, you can clone one of SourceClear’s public Python repositories:

git clone https://github.com/srcclr/example-python
Note: You can also scan code hosted in a remote Git repository by using the --url argument with the CLI agent (see documentation for usage), but this guide assumes you have the code stored locally.

After you have cloned the code to your desktop, point the SourceClear CLI agent at the directory of the repository and scan away:

# Replace "example-python" with the project folder name of your choosing
srcclr scan path/to/example-python

To view more verbose output during the scan process, you can add the --loud argument as well:

srcclr scan path/to/example-python --loud
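If you want to start the scan from a script, the two commands above can be wrapped as follows. This is a rough sketch that assumes `srcclr` is on your PATH; the helper names `build_scan_command` and `run_scan` are hypothetical, and only the `scan` subcommand and the `--loud` argument shown above are used.

```python
import subprocess


def build_scan_command(project_dir, loud=False):
    """Build the argument list for `srcclr scan`, optionally with --loud."""
    command = ["srcclr", "scan", project_dir]
    if loud:
        command.append("--loud")
    return command


def run_scan(project_dir, loud=False):
    """Run the scan and return (exit_code, combined_output)."""
    result = subprocess.run(
        build_scan_command(project_dir, loud),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured output
        universal_newlines=True,   # decode output as text
    )
    return result.returncode, result.stdout
```

Calling `run_scan("path/to/example-python", loud=True)` would execute the second command shown above and return its output as a string.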

The SourceClear agent will then proceed to use either a setup.py, requirements.txt, requirements-dev.txt, or dev-requirements.txt within a virtual environment to pip install and identify the dependencies within your project.

Once the agent has evaluated the open source libraries in use, it produces a summary of the scan results. The summary includes counts for total libraries used, vulnerable libraries, percentage of third-party code, and vulnerable methods in use, as well as a list of the vulnerabilities found.

Configuring scans

One of the requirements for SourceClear scanning is access to the dependencies being used, and many Python repositories require a specific scope or configuration option (for example, specifying the usage of system site packages). By adding a srcclr.yml file to the directory where you point the SourceClear agent, you can specify directives that control how your Python code is scanned. The following configuration options can be used within your srcclr.yml for Python scanning:

Directive                Description
scope                    Specifies scope of dependency resolution
use_system_pip           If set to true, the agent uses the pip installed on the user machine rather than bundled pip
pip_requirements_file    Specifies the location and name of a non-standard requirements file
system_site_packages     When set to true, adds system site packages to virtualenv image used for scanning
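As an illustration, the sketch below renders a few of these directives as the text of a srcclr.yml file. The directive names come from the table above; the flat `key: value` layout, the example values, and the path `requirements/base.txt` are assumptions. The `scope` directive is left out because its accepted values are not listed here.

```python
def render_srcclr_yml(directives):
    """Render a flat mapping of directives as simple `key: value` YAML lines."""
    lines = []
    for key, value in directives.items():
        if isinstance(value, bool):
            value = "true" if value else "false"  # YAML booleans are lowercase
        lines.append("{}: {}".format(key, value))
    return "\n".join(lines) + "\n"


# Hypothetical values; only the directive names come from the table above.
example = {
    "use_system_pip": True,
    "system_site_packages": True,
    "pip_requirements_file": "requirements/base.txt",
}

if __name__ == "__main__":
    print(render_srcclr_yml(example), end="")
```

The resulting text would be saved as srcclr.yml in the directory that you point the agent at.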

Viewing scan results

After completing a scan, the bottom of the output in your terminal will include a link to the SourceClear platform to view the scan results in more detail:

Licenses
Unique Library Licenses                     4
Libraries Using GPL                         0
Libraries With No License                   1

Full Report Details               https://acmedemo.sourceclear.io/teams/Qx2xtF1/scans/1555923

Navigating to this link will allow you to view the results of your scan in its entirety.
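If you capture the agent's terminal output, the summary rows and the report link can be pulled out programmatically. The following is a minimal sketch that assumes the output keeps the two-column layout of the sample above; the layout may differ between agent versions, and the function name `parse_summary` is hypothetical.

```python
import re

SAMPLE_OUTPUT = """\
Licenses
Unique Library Licenses                     4
Libraries Using GPL                         0
Libraries With No License                   1

Full Report Details               https://acmedemo.sourceclear.io/teams/Qx2xtF1/scans/1555923
"""


def parse_summary(output):
    """Extract `label  value` rows from scan output into a dict."""
    summary = {}
    for line in output.splitlines():
        # A row is a label, a run of two or more spaces, then a single value.
        match = re.match(r"^(\S.*?)\s{2,}(\S+)$", line.rstrip())
        if match:
            label, value = match.groups()
            summary[label] = int(value) if value.isdigit() else value
    return summary


if __name__ == "__main__":
    print(parse_summary(SAMPLE_OUTPUT)["Full Report Details"])
```

Section headings such as "Licenses" have no value column and are skipped.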

The scan results are broken down into the following categories:

  • Issues: This category comprises out-of-date libraries, license violations, and vulnerabilities, each uniquely associated with a specific version of a library within a specific repository.
  • Vulnerabilities: This list represents the set of unique vulnerabilities across a specific project. If multiple libraries in a given project are associated with the same vulnerability, the vulnerability will only appear once in this list.
  • Libraries: Libraries consist of each open-source library that SourceClear has identified within a code project. SourceClear maintains a database which is in sync with PyPI in order to provide the most up-to-date information on your Python libraries.
  • Licenses: Licenses allow users to view the software license information associated with each open-source library in use. SourceClear maintains license information by keeping in sync with PyPI as described above.

You can find more details on these categories in the Issues, Vulnerabilities, Libraries, and Licenses documentation article.
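The difference between the Issues list and the Vulnerabilities list can be illustrated with a small sketch. The records below are hypothetical and are not taken from a real scan: three issues reference two distinct vulnerabilities, so the Vulnerabilities list would contain two entries.

```python
# Hypothetical issue records: (library, version, vulnerability title).
issues = [
    ("lib-a", "1.0", "Cross-Site Scripting (XSS)"),
    ("lib-b", "2.3", "Cross-Site Scripting (XSS)"),
    ("lib-b", "2.3", "Denial of Service (DoS)"),
]


def unique_vulnerabilities(issue_records):
    """Return each vulnerability once, in first-seen order."""
    seen = []
    for _library, _version, vulnerability in issue_records:
        if vulnerability not in seen:
            seen.append(vulnerability)
    return seen


if __name__ == "__main__":
    print(len(issues), "issues,", len(unique_vulnerabilities(issues)), "unique vulnerabilities")
```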

Fixing vulnerability issues

After viewing the scan results, users will likely want to fix the vulnerabilities discovered in their Python project. SourceClear provides clear instructions for fixing vulnerability issues through the web interface.

Fixing a direct vulnerability

When a library is specifically referenced in your setup.py, requirements.txt, requirements-dev.txt, or dev-requirements.txt, SourceClear refers to the library as a “direct” dependency. Fixing a vulnerability in a direct dependency using SourceClear is simple. Using the open-source project mentioned in the Running a scan section, navigate to the project scan results within the SourceClear UI and filter down to “Vulnerability” issues which are included only in “Direct Libraries”.

After filtering the scan results, you can drill into an issue to find out how to fix it by clicking the issue ID next to the vulnerability name. This will bring you to the issue details page, where you will find information on fixing the vulnerability. In general, the best way to fix a vulnerability in a direct dependency is to update the version in use to the version recommended by SourceClear. The recommended version is associated neither with the vulnerability you are subject to nor with any other vulnerability that switching versions might introduce. To limit the impact of the update on your code, the recommended version is the one closest to your current version that meets these conditions.
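The selection rule described above can be sketched as follows. This is an illustration only and not SourceClear's actual implementation: it interprets "closest" as the lowest release above the current one, handles only purely numeric dotted versions, and uses hypothetical release and vulnerability data.

```python
def parse_version(text):
    """Turn a purely numeric dotted version such as '1.4.2' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def closest_safe_version(current, available, vulnerable):
    """Pick the lowest version above `current` that is not in `vulnerable`."""
    candidates = [
        v for v in available
        if v not in vulnerable and parse_version(v) > parse_version(current)
    ]
    return min(candidates, key=parse_version) if candidates else None


# Hypothetical release list and vulnerability data, for illustration only.
available = ["1.4.0", "1.4.1", "1.4.2", "1.5.0"]
vulnerable = {"1.4.0", "1.4.1"}
print(closest_safe_version("1.4.0", available, vulnerable))  # prints 1.4.2
```

The function returns `None` when no safe release exists, in which case there is no version to update to.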

Note: Some libraries include vulnerabilities which have not yet been fixed, and therefore SourceClear cannot provide a version to update to. In cases such as this, it is recommended you either create a pull request to the unfixed library or use a different library in your code.

As an example, the following provides a fix for a “Denial of Service (DoS) Memory Consumption” vulnerability in feedparser, version 5.1.1, in the example-python repository.

Instructions

Update the requirements.txt file in the root of the project (or whichever file is specified in the details section) to match the following:

feedparser==5.1.2

Once you have completed these steps, validate the fix.
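The edit to the requirements file can also be scripted. The sketch below rewrites requirements text so that a package is pinned to an exact version, replacing an existing line for that package or appending a new one. The function name `pin_requirement` is hypothetical, and the matching is deliberately simple: it does not handle extras, environment markers, or inline comments.

```python
import re


def pin_requirement(requirements_text, name, version):
    """Return requirements text with `name` pinned to exactly `version`.

    An existing line for the package is replaced; otherwise a pin is appended.
    """
    pinned = "{}=={}".format(name, version)
    # Match the package name followed by a version operator or end of line.
    pattern = re.compile(
        r"^\s*" + re.escape(name) + r"\s*(?:[=<>!~].*)?$", re.IGNORECASE
    )
    lines = requirements_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        if pattern.match(line):
            lines[index] = pinned
            replaced = True
    if not replaced:
        lines.append(pinned)
    return "\n".join(lines) + "\n"
```

For example, `pin_requirement(text, "feedparser", "5.1.2")` applied to the contents of requirements.txt produces the pin shown in the instructions above.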

Fixing a transitive vulnerability

Direct dependencies often depend on other libraries, which are referred to as transitive dependencies. Vulnerabilities in transitive dependencies are common because, without a tool such as SourceClear to show this information, developers often do not realize that a library they are adding to their project depends on a vulnerable library. Fixing vulnerabilities in transitive dependencies can be difficult because the direct dependency may require a specific version rather than a version range. You can find details on issues in transitive dependencies by filtering your issues by “Vulnerabilities” and leaving the “Direct Libraries” checkbox unchecked. Transitive vulnerabilities are indicated in the “Library” column by the smaller arrow next to the library name. Selecting the issue number to view the issue details also shows the “Type” of library, either direct or transitive.
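To see locally which installed packages pull in a given transitive library, and with what version constraint, you can inspect package metadata. The following is a rough sketch that is independent of SourceClear. It assumes Python 3.8 or later (for `importlib.metadata`) and that the project's dependencies are installed in the current environment; the helper names are hypothetical.

```python
import re
from importlib import metadata  # available in Python 3.8+


def requirement_name(requirement):
    """Extract the normalized package name from a requirement string."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", requirement)
    return re.sub(r"[-_.]+", "-", match.group(1)).lower() if match else ""


def dependents_of(target):
    """List (distribution, requirement) pairs for installed packages that require `target`."""
    wanted = requirement_name(target)
    found = []
    for dist in metadata.distributions():
        # `requires` is None when a distribution declares no dependencies.
        for requirement in dist.requires or []:
            if requirement_name(requirement) == wanted:
                name = dist.metadata.get("Name") or "unknown"
                found.append((name, requirement))
    return sorted(found)


if __name__ == "__main__":
    for name, requirement in dependents_of("html5lib"):
        print(name, "requires", requirement)
```

A requirement that pins an exact version (using `==`) indicates a direct dependency that may need to be updated before the transitive library can be changed.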

Fixing a transitive library for Python involves overriding the transitive dependency: add the appropriately versioned dependency as a direct library in your configuration file, which may be a requirements.txt or setup.py. As an example, the following provides a fix for a “Cross-Site Scripting (XSS)” vulnerability in html5lib, version 0.9999999, in the transitive_vulns branch of the example-python repository.

Instructions

Update the requirements.txt file in the root of the project and add the recommended version of the library:

html5lib==0.99999999
Note: Updating some transitive libraries will fail because a specific version is required for usage. In cases such as these, you will need to update the directly specified library to a version which allows for the safe version to be used.

Once you have completed these steps, validate the fix.

Fixing a vulnerable method

Within the issues across a given project, you can filter your list to display only vulnerabilities where a vulnerable method is in use by clicking the “Vulnerable methods” checkbox above your issues list. If a vulnerable method is shown to be in use, as indicated by the warning icon, the specific piece of code that makes the library vulnerable is being called by the project in which it was found. This is a crucial distinction from other vulnerabilities, where you might not be using the vulnerable part of the code. Those issues are still important, but they are more a matter of code hygiene: fixing them prevents developers from using the vulnerable code in the future.

Within the issue details for a vulnerability where a vulnerable method is in use, SourceClear provides the full call path for every instance of a given vulnerable method. This helps users evaluate the importance of the vulnerability based on its usage within their project, and lets them alter their own code rather than fix the vulnerability by updating the library.
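To get a feel for what locating a method's usage involves, the sketch below uses Python's `ast` module to list the positions where a method with a given name is called. It is a simplified illustration and not how SourceClear computes call paths: it matches by name only, within a single source string, and `somelib.risky_call` is a hypothetical stand-in for a vulnerable method.

```python
import ast


def find_calls(source, method_name, filename="<source>"):
    """Return (line, column) positions where `method_name` is called in `source`."""
    positions = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Call):
            func = node.func
            # Handle both `risky_call(...)` and `module.risky_call(...)` forms.
            if isinstance(func, ast.Name):
                called = func.id
            else:
                called = getattr(func, "attr", None)
            if called == method_name:
                positions.append((node.lineno, node.col_offset))
    return sorted(positions)


# Hypothetical project code; `risky_call` stands in for a vulnerable method.
sample = "import somelib\nresult = somelib.risky_call(data)\nprint(result)\n"
print(find_calls(sample, "risky_call"))  # prints [(2, 9)]
```

Each reported position is a place in your own code that you could review or change if updating the library is not an option.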