38 commits
53576c4
CVE-2007-4559 Tried to copy back only the new bits from Python3.6 Sym…
rickprice May 1, 2024
ad414e6
vendor version 2.6.4 of Expat
rickprice Dec 13, 2024
2a5975c
Add #define for XML_GE which exists in the newer version
rickprice Dec 13, 2024
e8a15ed
Rename struct that conflicts, perhaps with MSVC
rickprice Dec 14, 2024
132b6a0
Fix compile errors caused by missing #DEFINE and rename a conflict
rickprice Dec 13, 2024
1e74097
Switch Expat to the AS Platform Expat
rickprice Dec 16, 2024
6ad96ea
Merge pull request #66 from ActiveState/BE-4504-python-2-7-expat-upda…
icanhasmath Dec 18, 2024
06f2db7
Add tests to show that CVE-2024-6232 is okay
rickprice Jan 2, 2025
666cdf7
Merge pull request #67 from ActiveState/BE-4449-cve-2024-6232
icanhasmath Jan 2, 2025
8701d6c
Merge pull request #65 from ActiveState/BE-4504-python-2-7-expat-upda…
icanhasmath Jan 7, 2025
2592f32
CVE-2007-4559 Remove invalid tests, get others working with Python2
rickprice May 1, 2024
4801727
Add in ActiveTests so its easier to test in future
rickprice Jan 6, 2025
a7c09b5
CVE-2007-4559 Add NEWS entry
rickprice Jan 20, 2025
0220b82
Merge pull request #68 from ActiveState/BE-3659-cve-2007-4559-IIII
icanhasmath Jan 21, 2025
a8922cf
Add news entry for CVE-2024-6232
rickprice Jan 21, 2025
00c24da
BE-4504 Fix minidom test to use newer expat
rickprice Jan 22, 2025
3737df1
BE-4504 Test uses invalid IRI, Expat is now stricter presumably, removed
rickprice Jan 22, 2025
9a39a8c
Merge pull request #69 from ActiveState/BE-4504-python-2-7-expat-upda…
icanhasmath Jan 22, 2025
baeeb8d
BE-4012 Increment release version
rickprice Jan 22, 2025
a22a1d8
Merge pull request #70 from ActiveState/BE-4012-python-2-7-18-11
icanhasmath Jan 23, 2025
8cc4686
Be 5270 CVE 2023 27043 python 2 7 (#73)
rickprice Mar 14, 2025
ef22247
BE-5248 New AS Release 2.7.18.12
rickprice Apr 30, 2025
c98f6b9
Refactor CVE-2023-27043 patch to support Unicode characters
ezequielp-activestate Mar 6, 2026
4d53926
remove comment
ezequielp-activestate Mar 6, 2026
20b5521
2.7.18.13 Release
ezequielp-activestate Mar 6, 2026
c9092b8
Add better tests for CVE 2023-27043
ezequielp-activestate Mar 13, 2026
25093f5
Patch `getaddresses` to support unicode strings
ezequielp-activestate Mar 13, 2026
b889bb2
Address CVE-2025-8194 (tarfile) and CVE-2026-4786 (webbrowser)
icanhasmath May 27, 2026
b6bc4c9
Reject control characters in header/command APIs (injection cluster)
icanhasmath May 27, 2026
bf72fb1
Reject header injection when generating email messages (CVE-2024-6923)
icanhasmath May 27, 2026
bfe0766
Harden zipfile against overlapping entries and bad ZIP64 locator
icanhasmath May 27, 2026
6bb4e3f
Reject misplaced square brackets in parsed URL hosts (CVE-2025-0938)
icanhasmath May 27, 2026
7a18f2e
Fix quadratic complexity in minidom and os.path.expandvars
icanhasmath May 27, 2026
db4f9c0
Fix quadratic complexity in HTMLParser at EOF (CVE-2025-6069)
icanhasmath May 27, 2026
b477d48
Add strict validation option to base64.b64decode
icanhasmath May 27, 2026
7de584c
Don't normalize AREGTYPE follow-up headers to DIRTYPE (CVE-2025-13462)
icanhasmath May 27, 2026
e477d8b
2.7.18.14 Release
icanhasmath May 28, 2026
bc3b091
fix(tests): skip-guard expandvars nonascii test; restore _have_socket…
icanhasmath May 29, 2026
1 change: 1 addition & 0 deletions .gitignore
@@ -87,3 +87,4 @@ coverage/
externals/
htmlcov/
gmon.out
.aider*
12 changes: 10 additions & 2 deletions Doc/library/email.utils.rst
@@ -21,13 +21,18 @@ There are several useful utilities provided in the :mod:`email.utils` module:
begins with angle brackets, they are stripped off.


.. function:: parseaddr(address)
.. function:: parseaddr(address, strict=True)

Parse address -- which should be the value of some address-containing field such
as :mailheader:`To` or :mailheader:`Cc` -- into its constituent *realname* and
*email address* parts. Returns a tuple of that information, unless the parse
fails, in which case a 2-tuple of ``('', '')`` is returned.

If *strict* is true, use a strict parser which rejects malformed inputs.

.. versionchanged:: 2.7.18.12
Add *strict* optional parameter and reject malformed inputs by default.


.. function:: formataddr(pair)

@@ -37,7 +42,7 @@ There are several useful utilities provided in the :mod:`email.utils` module:
second element is returned unmodified.


.. function:: getaddresses(fieldvalues)
.. function:: getaddresses(fieldvalues, strict=True)

This method returns a list of 2-tuples of the form returned by ``parseaddr()``.
*fieldvalues* is a sequence of header field values as might be returned by
@@ -52,6 +57,9 @@ There are several useful utilities provided in the :mod:`email.utils` module:
resent_ccs = msg.get_all('resent-cc', [])
all_recipients = getaddresses(tos + ccs + resent_tos + resent_ccs)

.. versionchanged:: 2.7.18.12
Add *strict* optional parameter and reject malformed inputs by default.


.. function:: parsedate(date)

17 changes: 17 additions & 0 deletions Doc/whatsnew/2.7.rst
@@ -2793,3 +2793,20 @@ The author would like to thank the following people for offering
suggestions, corrections and assistance with various drafts of this
article: Nick Coghlan, Philip Jenvey, Ryan Lovett, R. David Murray,
Hugh Secker-Walker.


Notable changes in 2.7.18.12
============================

email
-----

* :func:`email.utils.getaddresses` and :func:`email.utils.parseaddr` now return
``('', '')`` 2-tuples in more situations where invalid email addresses are
encountered, instead of potentially inaccurate values.
An optional *strict* parameter was added to these two functions:
use ``strict=False`` to get the old behavior, accepting malformed inputs.
``getattr(email.utils, 'supports_strict_parsing', False)`` can be used to
check if the *strict* parameter is available.
(Contributed by Thomas Dwyer and Victor Stinner for :gh:`102988` to improve
the CVE-2023-27043 fix.)
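The feature check mentioned in the release note above could be used roughly as follows. This is a minimal sketch, not code from this PR: `parse_address` is a hypothetical wrapper name, and the snippet is written to run on any interpreter, whether or not it carries the CVE-2023-27043 fix.

```python
import email.utils

# Feature check described in the release note; evaluates to False on
# interpreters that predate the CVE-2023-27043 fix.
HAS_STRICT = getattr(email.utils, 'supports_strict_parsing', False)


def parse_address(value, strict=True):
    """Hypothetical wrapper: pass strict= only where it is supported."""
    if HAS_STRICT:
        return email.utils.parseaddr(value, strict=strict)
    return email.utils.parseaddr(value)


name, addr = parse_address('Test User <user@example.com>')
```

Callers that depend on the old lenient results can pass `strict=False` through the wrapper; on an unpatched interpreter the argument is simply ignored.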
2 changes: 1 addition & 1 deletion Include/patchlevel.h
@@ -27,7 +27,7 @@
#define PY_RELEASE_SERIAL 0

/* Version as a string */
#define PY_VERSION "2.7.18.10"
#define PY_VERSION "2.7.18.14"
/*--end constants--*/

/* Subversion Revision number of this file (not of the repository). Empty
23 changes: 19 additions & 4 deletions Lib/Cookie.py
@@ -92,13 +92,14 @@
'Set-Cookie: chips=ahoy\r\nSet-Cookie: vienna=finger'

The load() method is darn-tootin smart about identifying cookies
within a string. Escaped quotation marks, nested semicolons, and other
such trickeries do not confuse it.
within a string. Escaped quotation marks and nested semicolons do not
confuse it. (Note that cookies whose values contain control characters
are now rejected to prevent Set-Cookie header injection; CVE-2026-0672.)

>>> C = Cookie.SmartCookie()
>>> C.load('keebler="E=everybody; L=\\"Loves\\"; fudge=\\012;";')
>>> C.load('keebler="E=everybody; L=\\"Loves\\"; fudge=delicious;";')
>>> print C
Set-Cookie: keebler="E=everybody; L=\"Loves\"; fudge=\012;"
Set-Cookie: keebler="E=everybody; L=\"Loves\"; fudge=delicious;"

Each element of the Cookie also supports all of the RFC 2109
Cookie attributes. Here's an example which sets the Path
@@ -242,6 +243,15 @@ class CookieError(Exception):
# _Translator hash-table for fast quoting
#
_LegalChars = string.ascii_letters + string.digits + "!#$%&'*+-.^_`|~"
_control_character_re = re.compile(r'[\x00-\x1f\x7f]')

def _has_control_character(*values):
"""Return True if any of the given string values holds a control char."""
for v in values:
if isinstance(v, basestring) and _control_character_re.search(v):
return True
return False

_Translator = {
'\000' : '\\000', '\001' : '\\001', '\002' : '\\002',
'\003' : '\\003', '\004' : '\\004', '\005' : '\\005',
@@ -424,6 +434,8 @@ def __setitem__(self, K, V):
K = K.lower()
if not K in self._reserved:
raise CookieError("Invalid Attribute %s" % K)
if _has_control_character(K, V):
raise CookieError("Control characters are not allowed in cookies: %r %r" % (K, V))
dict.__setitem__(self, K, V)
# end __setitem__

@@ -440,6 +452,9 @@ def set(self, key, val, coded_val,
raise CookieError("Attempt to set a reserved key: %s" % key)
if "" != translate(key, idmap, LegalChars):
raise CookieError("Illegal key value: %s" % key)
if _has_control_character(key, val, coded_val):
raise CookieError("Control characters are not allowed in cookies: %r %r %r"
% (key, val, coded_val))

# It's a good key, so save it.
self.key = key
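The control-character guard added to `Cookie.py` can be tried in isolation. The sketch below is a standalone Python 3 rendering of the helper in this diff, with `str` standing in for the 2.7 `basestring` check; it is not the patched module itself, and the public name `has_control_character` is chosen here for illustration only.

```python
import re

# Same character class as the patch: C0 controls (0x00-0x1f) plus DEL (0x7f).
_control_character_re = re.compile(r'[\x00-\x1f\x7f]')


def has_control_character(*values):
    """Return True if any string argument contains a control character."""
    for v in values:
        # The 2.7 patch tests against basestring; str is the Python 3 analogue.
        # Non-string arguments are skipped rather than rejected.
        if isinstance(v, str) and _control_character_re.search(v):
            return True
    return False
```

A value such as `'ahoy\r\nSet-Cookie: evil=1'` is flagged because it contains CR and LF, which is the Set-Cookie header injection case the docstring change refers to.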
41 changes: 33 additions & 8 deletions Lib/HTMLParser.py
@@ -20,6 +20,7 @@
charref = re.compile('&#(?:[0-9]+|[xX][0-9a-fA-F]+)[^0-9a-fA-F]')

starttagopen = re.compile('<[a-zA-Z]')
endtagopen = re.compile('</[a-zA-Z]')
piclose = re.compile('>')
commentclose = re.compile(r'--\s*>')

@@ -167,22 +168,46 @@ def goahead(self, end):
k = self.parse_pi(i)
elif startswith("<!", i):
k = self.parse_html_declaration(i)
elif (i + 1) < n:
elif (i + 1) < n or end:
self.handle_data("<")
k = i + 1
else:
break
if k < 0:
if not end:
break
k = rawdata.find('>', i + 1)
if k < 0:
k = rawdata.find('<', i + 1)
if k < 0:
k = i + 1
# End of input with an unterminated construct. Close it
# per HTML5 instead of rescanning, which made repeated
# incomplete constructs quadratic (CVE-2025-6069).
if starttagopen.match(rawdata, i): # < + letter
pass
elif startswith("</", i):
if i + 2 == n:
self.handle_data("</")
elif endtagopen.match(rawdata, i): # </ + letter
pass
else:
# bogus comment
self.handle_comment(rawdata[i+2:])
elif startswith("<!--", i):
j = n
for suffix in ("--!", "--", "-"):
if rawdata.endswith(suffix, i+4):
j -= len(suffix)
break
self.handle_comment(rawdata[i+4:j])
elif startswith("<![CDATA[", i):
self.unknown_decl(rawdata[i+3:])
elif rawdata[i:i+9].lower() == '<!doctype':
self.handle_decl(rawdata[i+2:])
elif startswith("<!", i):
# bogus comment
self.handle_comment(rawdata[i+2:])
elif startswith("<?", i):
self.handle_pi(rawdata[i+2:])
else:
k += 1
self.handle_data(rawdata[i:k])
raise AssertionError("we should not get here!")
k = n
i = self.updatepos(i, k)
elif startswith("&#", i):
match = charref.match(rawdata, i)
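The unterminated-comment branch in the hunk above trims a partial terminator from the end of the input before reporting the comment. The following sketch isolates that step; `unterminated_comment_body` is a hypothetical name used only for illustration, and the function is a standalone copy of the slicing logic rather than part of `HTMLParser`.

```python
def unterminated_comment_body(rawdata, i):
    """Return the comment text for an unterminated '<!--' starting at index i.

    Mirrors the end-of-input branch in the diff: a trailing partial
    terminator ('--!', '--' or '-') is dropped from the comment text.
    """
    j = len(rawdata)
    for suffix in ("--!", "--", "-"):
        # endswith(suffix, start) only inspects rawdata[start:], so the
        # dashes of the opening '<!--' are never taken for a terminator.
        if rawdata.endswith(suffix, i + 4):
            j -= len(suffix)
            break
    return rawdata[i + 4:j]
```

Because the rest of the buffer is consumed in one step (`k = n` in the patch), each unterminated construct is handled once instead of being rescanned.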
27 changes: 23 additions & 4 deletions Lib/base64.py
@@ -57,18 +57,37 @@ def b64encode(s, altchars=None):
return encoded


def b64decode(s, altchars=None):
def b64decode(s, altchars=None, validate=False):
"""Decode a Base64 encoded string.

s is the string to decode. Optional altchars must be a string of at least
length 2 (additional characters are ignored) which specifies the
alternative alphabet used instead of the '+' and '/' characters.

The decoded string is returned. A TypeError is raised if s is
incorrectly padded. Characters that are neither in the normal base-64
alphabet nor the alternative alphabet are discarded prior to the padding
check.
incorrectly padded.

If validate is False (the default), characters that are neither in the
normal base-64 alphabet nor the alternative alphabet are discarded prior
to the padding check. If validate is True, these non-alphabet characters
in the input result in a binascii.Error.

Unlike upstream (which only deprecates the lenient behaviour), validation
here checks the input against the *requested* alphabet, so the standard
'+'/'/' characters are rejected when an alternative alphabet is given
(CVE-2025-12781), and any data after the padding is rejected rather than
silently ignored (CVE-2026-3446).
"""
if validate:
if altchars is not None:
extra = altchars[:2]
else:
extra = b'+/'
valid = frozenset(string.ascii_letters + string.digits + extra)
stripped = s.rstrip(b'=')
npad = len(s) - len(stripped)
if npad > 2 or not all(c in valid for c in stripped):
raise binascii.Error('Non-base64 digit found')
if altchars is not None:
s = s.translate(string.maketrans(altchars[:2], '+/'))
try:
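The validation step described in the docstring above can be sketched as a standalone function. This is a Python 3 rendering under stated assumptions: `strict_b64decode` is a hypothetical name, input is `bytes`, and `bytes.maketrans` replaces the 2.7 `string.maketrans`. It is not the patched `base64.b64decode`, which raises the same `binascii.Error` but lives behind the `validate` flag.

```python
import base64
import binascii
import string


def strict_b64decode(s, altchars=None):
    """Decode bytes, rejecting anything outside the requested alphabet."""
    extra = altchars[:2] if altchars is not None else b'+/'
    alphabet = (string.ascii_letters + string.digits).encode('ascii') + extra
    valid = frozenset(alphabet)  # iterating bytes yields ints in Python 3
    stripped = s.rstrip(b'=')
    npad = len(s) - len(stripped)
    # More than two '=' or any foreign byte before the padding is an error;
    # this also catches data that follows the padding.
    if npad > 2 or not all(c in valid for c in stripped):
        raise binascii.Error('Non-base64 digit found')
    if altchars is not None:
        s = s.translate(bytes.maketrans(altchars[:2], b'+/'))
    return base64.b64decode(s)
```

With `altchars=b'-_'`, the input `b'+/+/'` is rejected because `+` and `/` are not in the requested alphabet, while `b'-_-_'` decodes normally.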
4 changes: 4 additions & 0 deletions Lib/email/errors.py
@@ -30,6 +30,10 @@ class CharsetError(MessageError):
"""An illegal charset was given."""


class HeaderWriteError(MessageError):
"""Error while writing headers."""



# These are parsing defects which the parser was able to work around.
class MessageDefect:
22 changes: 17 additions & 5 deletions Lib/email/generator.py
@@ -13,6 +13,12 @@

from cStringIO import StringIO
from email.header import Header
from email.errors import HeaderWriteError

# Matches a CR/LF that is NOT part of a valid header folding (i.e. not
# immediately followed by folding whitespace). Used to detect injected
# newlines in generated headers (CVE-2024-6923).
NEWLINE_WITHOUT_FWSP = re.compile(r'\r\n[^ \t]|\r[^ \n\t]|\n[^ \t]')

UNDERSCORE = '_'
NL = '\n'
@@ -139,29 +145,35 @@ def _dispatch(self, msg):

def _write_headers(self, msg):
for h, v in msg.items():
print >> self._fp, '%s:' % h,
if self._maxheaderlen == 0:
# Explicit no-wrapping
print >> self._fp, v
value = v
elif isinstance(v, Header):
# Header instances know what to do
print >> self._fp, v.encode()
value = v.encode()
elif _is8bitstring(v):
# If we have raw 8bit data in a byte string, we have no idea
# what the encoding is. There is no safe way to split this
# string. If it's ascii-subset, then we could do a normal
# ascii split, but if it's multibyte then we could break the
# string. There's no way to know so the least harm seems to
# be to not split the string and risk it being too long.
print >> self._fp, v
value = v
else:
# Header's got lots of smarts, so use it. Note that this is
# fundamentally broken though because we lose idempotency when
# the header string is continued with tabs. It will now be
# continued with spaces. This was reversedly broken before we
# fixed bug 1974. Either way, we lose.
print >> self._fp, Header(
value = Header(
v, maxlinelen=self._maxheaderlen, header_name=h).encode()
# Reject headers that contain an injected newline, i.e. a CR/LF
# that is not part of valid header folding (CVE-2024-6923).
folded = '%s: %s' % (h, value)
if NEWLINE_WITHOUT_FWSP.search(folded):
raise HeaderWriteError(
"header value contains an unexpected newline: %r" % (folded,))
print >> self._fp, folded
# A blank line always separates headers from body
print >> self._fp

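The newline check in `_write_headers` relies on one regular expression. The sketch below exercises that pattern outside the generator; `check_header_line` is a hypothetical helper, and it raises `ValueError` purely so the snippet is self-contained, whereas the patch raises `HeaderWriteError`.

```python
import re

# Same pattern as the patch: a CR, LF or CRLF that is not followed by
# folding whitespace (space or tab).
NEWLINE_WITHOUT_FWSP = re.compile(r'\r\n[^ \t]|\r[^ \n\t]|\n[^ \t]')


def check_header_line(name, value):
    """Return 'name: value', or raise if the value smuggles in a newline."""
    folded = '%s: %s' % (name, value)
    if NEWLINE_WITHOUT_FWSP.search(folded):
        raise ValueError(
            "header value contains an unexpected newline: %r" % (folded,))
    return folded
```

A correctly folded value such as `'first part\r\n second part'` passes, since the line break is followed by a space. A value such as `'value\r\nInjected: header'` is rejected.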
58 changes: 55 additions & 3 deletions Lib/email/test/test_email.py
@@ -77,6 +77,26 @@ def _msgobj(self, filename):

# Test various aspects of the Message class's API
class TestMessageAPI(TestEmailBase):
def test_string_rejects_header_injection(self):
# CVE-2024-6923: generating a message must reject header values that
# contain an injected newline (one not part of valid folding). The
# no-wrap path writes the value verbatim, which deterministically
# exercises the check.
import email.errors
from email.generator import Generator
from cStringIO import StringIO
for bad in ('value\r\nInjected: header',
'value\nInjected: header',
'value\rstuff'):
msg = Message()
msg['Subject'] = bad
g = Generator(StringIO(), maxheaderlen=0)
self.assertRaises(email.errors.HeaderWriteError, g.flatten, msg)
# A normal header is still emitted fine.
msg = Message()
msg['Subject'] = 'a normal subject that is reasonably short'
self.assertIn('Subject: a normal subject', msg.as_string())

def test_get_all(self):
eq = self.assertEqual
msg = self._msgobj('msg_20.txt')
@@ -2320,6 +2340,22 @@ def test_parseaddr_multiple_domains(self):
('', '')
)

def test_parseaddr_unicode(self):
"""Test parseaddr with unicode strings"""

test_cases = [
u'user@example.com',
u'Test User <user@example.com>',
u'"Test User" <user@example.com>',
]

for addr in test_cases:
result = Utils.parseaddr(addr, strict=True)
self.assertNotEqual(result, ('', ''))

result_non_strict = Utils.parseaddr(addr, strict=False)
self.assertEqual(result, result_non_strict)

def test_noquote_dump(self):
self.assertEqual(
Utils.formataddr(('A Silly Person', 'person@dom.ain')),
@@ -2417,9 +2453,11 @@ def test_getaddresses(self):
def test_getaddresses_nasty(self):
eq = self.assertEqual
eq(Utils.getaddresses(['foo: ;']), [('', '')])
eq(Utils.getaddresses(
['[]*-- =~$']),
[('', ''), ('', ''), ('', '*--')])
addresses = ['[]*-- =~$']
eq(Utils.getaddresses(addresses),
[('', '')])
eq(Utils.getaddresses(addresses, strict=False),
[('', ''), ('', ''), ('', '*--')])
eq(Utils.getaddresses(
['foo: ;', '"Jason R. Mastaler" <jason@dom.ain>']),
[('', ''), ('Jason R. Mastaler', 'jason@dom.ain')])
@@ -2430,6 +2468,20 @@ def test_getaddresses_embedded_comment(self):
addrs = Utils.getaddresses(['User ((nested comment)) <foo@bar.com>'])
eq(addrs[0][1], 'foo@bar.com')

def test_getaddresses_unicode(self):
"""Test getaddresses with unicode strings in Python 2"""

test_cases = [
([u'user@example.com'], [('', 'user@example.com')]),
([u'Test User <user@example.com>'], [('Test User', 'user@example.com')]),
([u'"Test User" <user@example.com>'], [('Test User', 'user@example.com')]),
([u'user1@example.com', u'user2@example.com'], [('', 'user1@example.com'), ('', 'user2@example.com')]),
]

for addrs, expected in test_cases:
result = Utils.getaddresses(addrs)
self.assertEqual(result, expected)

def test_make_msgid_collisions(self):
# Test make_msgid uniqueness, even with multiple threads
class MsgidsThread(Thread):