The Python standard library module ipaddress suffers from a critical IP address validation vulnerability identical to the flaw reported in the “netmask” library earlier this year.
The researchers who discovered the critical flaw in netmask also found the same flaw in this Python module and have obtained a vulnerability identifier for it: CVE-2021-29921.
The regression bug crept into Python 3.x’s ipaddress module as a result of a change made in 2019 by Python maintainers.
Leading zeroes stripped from IP addresses
In March, BleepingComputer first reported on a critical IP validation vulnerability in the netmask library used by thousands of applications.
The vulnerability, tracked as CVE-2021-28918 (Critical), CVE-2021-29418 (Medium), and CVE-2021-29424 (High), existed in both the npm and Perl versions of netmask, and in some other similar libraries.
It turns out that the ipaddress standard library module, introduced in Python 3.3, is also impacted by this vulnerability, as disclosed this week by Victor Viale, Sick Codes, Kelly Kaoudis, John Jackson, and Nick Sahler.
Tracked as CVE-2021-29921, the bug concerns improper parsing of IP addresses by the ipaddress standard library.
Python’s ipaddress module provides developers with functions to easily create IP addresses, networks, and interfaces, and to parse and normalize IP addresses supplied in different formats.
An IPv4 address can be represented in a variety of formats, including decimal, integer, octal, and hexadecimal, although the most commonly seen IPv4 addresses are expressed in decimal format.
For example, BleepingComputer’s IPv4 address in decimal format is 104.20.59.209, but the same address can be expressed in octal format as 0150.0024.0073.0321.
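The conversion between the two notations can be checked with a few lines of Python. This is a minimal sketch for illustration only; the helper name `to_octal_dotted` is hypothetical and is not part of the ipaddress module.

```python
def to_octal_dotted(address: str) -> str:
    """Rewrite a dotted-decimal IPv4 string with every octet in octal.

    Hypothetical helper for illustration; not part of any library.
    """
    octets = [int(part, 10) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-decimal IPv4 address: {address!r}")
    # A leading "0" marks the octet as octal; pad to three octal digits.
    return ".".join(f"0{o:03o}" for o in octets)


print(to_octal_dotted("104.20.59.209"))  # 0150.0024.0073.0321
```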
Say you are given an IP address in decimal format, 127.0.0.1, which is widely understood as the local loopback address or localhost.
If you were to prefix a 0 to it, should an application still parse 0127.0.0.1 as 127.0.0.1 or something else?
Try this in your web browser. In tests by BleepingComputer, typing 0127.0.0.1/ in Chrome’s address bar causes the browser to treat the entire string as an IP address in octal format.
On pressing enter or return, the IP in fact changes to its decimal equivalent of 87.0.0.1, which is how most applications are supposed to handle such ambiguous IP addresses.
Notably, 127.0.0.1 is not a public IP address but a loopback address. Its ambiguous representation, however, turns it into a public IP address that leads to a different host altogether.
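The browser behavior described above can be approximated in Python. The sketch below is a simplified assumption of how an octal-aware parser reads each dotted part (a `0x` prefix means hexadecimal, a leading `0` means octal, anything else is decimal); the function name `parse_c_style` is hypothetical, and real parsers also accept forms with fewer than four parts, which this sketch does not.

```python
import ipaddress


def parse_c_style(address: str) -> ipaddress.IPv4Address:
    """Read each dotted part as hex ('0x'), octal (leading '0'), or decimal.

    Simplified illustration; only the four-part form is handled.
    """
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dotted parts: {address!r}")
    octets = []
    for part in parts:
        if part.lower().startswith("0x"):
            value = int(part, 16)
        elif len(part) > 1 and part.startswith("0"):
            value = int(part, 8)
        else:
            value = int(part, 10)
        if not 0 <= value <= 255:
            raise ValueError(f"octet out of range in {address!r}")
        octets.append(value)
    # IPv4Address accepts a 4-byte packed representation.
    return ipaddress.IPv4Address(bytes(octets))


plain = parse_c_style("127.0.0.1")
prefixed = parse_c_style("0127.0.0.1")
print(plain, plain.is_loopback)        # 127.0.0.1 True
print(prefixed, prefixed.is_loopback)  # 87.0.0.1 False
```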
According to IETF’s original specification, for ambiguous IP addresses, parts of an IPv4 address can be interpreted as octal if prefixed with a “0.”
But, in the case of the Python standard library ipaddress, any leading zeros would simply be stripped and discarded.
A proof-of-concept test by researchers Sick Codes and Victor Viale shows Python’s ipaddress library would simply discard any leading zeroes.
In other words, when parsed by Python’s ipaddress module, ‘010.8.8.8’ would be treated as ‘10.8.8.8’, instead of ‘8.8.8.8’.
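Readers can check how their own interpreter behaves with a short test. The outcome depends on the Python version: on the affected releases described in this article the address should be accepted with the leading zero dropped, while interpreters that reject leading zeros raise an error instead. The wrapper name `parse_with_stdlib` is hypothetical.

```python
import ipaddress


def parse_with_stdlib(text):
    """Return the normalized address string, or None if ipaddress rejects it."""
    try:
        return str(ipaddress.IPv4Address(text))
    except ipaddress.AddressValueError:
        return None


result = parse_with_stdlib("010.8.8.8")
if result is None:
    print("rejected: this interpreter refuses the leading zero")
else:
    print(f"accepted as {result}: the leading zero was silently stripped")
```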
“Improper input validation of octal strings in Python 3.8.0 thru v3.10 stdlib ipaddress allows unauthenticated remote attackers to perform indeterminate [Server-Side Request Forgery (SSRF), Remote File Inclusion (RFI), and Local File Inclusion (LFI) attacks] on many programs that rely on Python stdlib ipaddress,” state the researchers.
For example, had an anti-SSRF blocklist been relying on Python’s ipaddress to parse a list of IPs, ambiguous IPs could easily be slipped in, rendering the anti-SSRF protections futile.
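To make the risk concrete, the sketch below simulates the two readings of the same input. It is an illustration under stated assumptions, not the researchers’ proof of concept: `lenient_view` mimics a validator that strips leading zeros, `octal_view` mimics a downstream component that treats a leading zero as octal, and `is_internal` stands in for a blocklist check. All three names are hypothetical.

```python
import ipaddress


def lenient_view(text):
    """How a zero-stripping validator reads the input: octets are always decimal."""
    return ipaddress.IPv4Address(
        ".".join(str(int(part, 10)) for part in text.split("."))
    )


def octal_view(text):
    """How an octal-aware consumer reads the input: a leading '0' means octal."""
    def octet(part):
        if len(part) > 1 and part.startswith("0"):
            return int(part, 8)
        return int(part, 10)

    return ipaddress.IPv4Address(
        ".".join(str(octet(part)) for part in text.split("."))
    )


def is_internal(addr):
    """Stand-in for a blocklist that refuses loopback and private targets."""
    return addr.is_loopback or addr.is_private


user_input = "0177.0.0.1"
checked = lenient_view(user_input)  # what the validator believes it approved
reached = octal_view(user_input)    # what the consumer actually connects to

print(checked, is_internal(checked))  # 177.0.0.1 False
print(reached, is_internal(reached))  # 127.0.0.1 True
```

The validator approves what looks like a public address, while the component that makes the request ends up at the loopback interface.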
Regression bug introduced in 2019, patch due to be released
Although the ipaddress module was introduced in Python 3.3, this regression bug crept into the module with Python 3.8.0 and affects versions through 3.10, according to the researchers.
Prior to v3.8.0a4, Python’s ipaddress had checks in place that rejected IP addresses provided in mixed formats (i.e., octal and decimal) altogether.
However, as seen by BleepingComputer, starting with Python version 3.8.0a4, these checks were removed entirely.
“Stop rejecting IPv4 octets for being ambiguously octal. Leading zeros are ignored, and no longer are assumed to specify octal octets. Octets are always decimal numbers. Octets must still be no more than three digits, including leading zeroes,” programmer Joel Croteau had noted at the time when committing this change.
A discussion soon followed among Python maintainers about the reasons behind this commit and the practical case for changing how ambiguous IP addresses are handled.
Although discussions about an upcoming patch are ongoing, it is not yet clear which version of Python will contain it.
One of the Python maintainers has suggested a different approach instead:
“It’s uncommon to pass IPv4 addresses with leading zeros.”
“If you want to tolerate leading zeros, you don’t have to modify the [sic] ipaddress for that, you can pre-process your inputs: it works on any Python version with or without the fix,” said Python maintainer Victor Stinner, proposing an alternative workaround to the issue.
Further discussion is ongoing among Python maintainers including Joel Croteau, Christian Heimes, and Victor Stinner on the best way to address this issue.
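Pre-processing of the kind described in the quote might look like the following. This is a minimal sketch and not the maintainer’s exact code; the helper name `strip_leading_zeros` is hypothetical.

```python
import ipaddress


def strip_leading_zeros(text: str) -> str:
    """Drop leading zeros from each dotted part, keeping a lone '0'.

    Hypothetical helper for illustration; not part of the ipaddress module.
    """
    return ".".join(part.lstrip("0") or "0" for part in text.split("."))


address = ipaddress.IPv4Address(strip_leading_zeros("010.8.8.8"))
print(address)  # 10.8.8.8
```

Note that this deliberately commits to the decimal reading, so it is only appropriate when every component that later consumes the address reads it the same way.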
The researchers’ detailed technical findings are provided in a blog post.