Core Functions

Notes

  • Strings, filehandles, and regexes passed to the *_preceded, *_separated, and *_terminated functions may be either binary or text. However, the arguments to a single invocation of a function must be either all binary or all text, and the return type will match.

  • Separators at the beginning and end of the input are handled differently depending on the segment convention:

    • When segments are terminated by a given separator, a separator at the beginning of the input creates an empty leading segment, and a separator at the end of the input simply terminates the last segment.

    • When segments are separated by a given separator, a separator at the beginning of the input creates an empty leading segment, and a separator at the end of the input creates an empty trailing segment.

    • When segments are preceded by a given separator, a separator at the beginning of the input simply starts the first segment, and a separator at the end of the input creates an empty trailing segment.

  • Two adjacent separators always create an empty segment between them, unless the separator is a regex that spans both separators at once.
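The conventions in these notes can be illustrated with a minimal standard-library sketch. The helpers terminated, separated, and preceded below are hypothetical stand-ins written only to mirror the rules above for a fixed-string separator; they are not linesep's implementation.

```python
import re

def terminated(s: str, sep: str) -> list[str]:
    # A trailing separator just ends the last segment, so drop the
    # empty string that str.split() leaves after it.
    parts = s.split(sep)
    if parts[-1] == "":
        parts.pop()
    return parts

def separated(s: str, sep: str) -> list[str]:
    # Leading and trailing separators both produce empty segments.
    return s.split(sep)

def preceded(s: str, sep: str) -> list[str]:
    # A leading separator just starts the first segment, so drop the
    # empty string that str.split() leaves before it.
    parts = s.split(sep)
    if parts[0] == "":
        parts.pop(0)
    return parts

text = "|a|b|"
print(terminated(text, "|"))  # ['', 'a', 'b']
print(separated(text, "|"))   # ['', 'a', 'b', '']
print(preceded(text, "|"))    # ['a', 'b', '']

# Adjacent separators: a fixed string yields an empty segment between them,
# while a regex that matches the whole run treats it as a single separator.
print("a||b".split("|"))         # ['a', '', 'b']
print(re.split(r"\|+", "a||b"))  # ['a', 'b']
```

The same input therefore yields three different segment lists, and only the separated convention keeps both empty ends.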

Splitting Strings

linesep.split_preceded(s: AnyStr, sep: Union[AnyStr, Pattern], retain: bool = False) -> list[AnyStr]

Split a string s into zero or more segments that are each preceded by the string or compiled regex sep. A list of segments is returned; an empty input string always produces an empty list.

Parameters
  • s – a binary or text string

  • sep – a string or compiled regex that indicates the start of a new segment wherever it occurs

  • retain (bool) – whether to include the separators at the beginning of each segment

Returns

a list of the segments in s

Return type

list of binary or text strings
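As a rough illustration of the retain flag, the sketch below mimics the documented behavior for a fixed-string separator using only str.split. The name split_preceded_sketch is hypothetical, and the handling of input that does not begin with sep (the leading text is returned as a segment without a separator) is an assumption rather than something stated above.

```python
def split_preceded_sketch(s: str, sep: str, retain: bool = False) -> list[str]:
    first, *rest = s.split(sep)
    # Every piece after the first was preceded by a separator in the input.
    segments = [sep + piece if retain else piece for piece in rest]
    # Assumption: text before the first separator forms a segment of its own.
    if first:
        segments.insert(0, first)
    return segments

print(split_preceded_sketch("-a-b", "-"))               # ['a', 'b']
print(split_preceded_sketch("-a-b", "-", retain=True))  # ['-a', '-b']
print(split_preceded_sketch("", "-"))                   # []
```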

linesep.split_separated(s: AnyStr, sep: Union[AnyStr, Pattern], retain: bool = False) -> list[AnyStr]

Split a string s into one or more segments separated by the string or compiled regex sep. A list of segments is returned; an empty input string will always produce a list with one element, the empty string.

Parameters
  • s – a binary or text string

  • sep – a string or compiled regex that indicates the end of one segment and the beginning of another wherever it occurs

  • retain (bool) – When True, the segment separators will be included in the output, with the elements of the list alternating between segments and separators, starting with a (possibly empty) segment

Returns

a list of the segments in s

Return type

list of binary or text strings
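The alternating layout described for retain=True resembles what the standard library's re.split returns when the pattern is wrapped in a capturing group. The sketch below uses that as a stand-in; split_separated_sketch is a hypothetical name, it takes a pattern string rather than a compiled regex, and it assumes the pattern contains no capturing groups of its own.

```python
import re

def split_separated_sketch(s: str, pattern: str, retain: bool = False) -> list[str]:
    # Wrapping the pattern in a capturing group makes re.split() keep the
    # matched separators in the result, alternating with the segments.
    regex = re.compile(f"({pattern})" if retain else pattern)
    return regex.split(s)

comma = r"\s*,\s*"
print(split_separated_sketch("a, b ,c", comma))               # ['a', 'b', 'c']
print(split_separated_sketch("a, b ,c", comma, retain=True))  # ['a', ', ', 'b', ' ,', 'c']
print(split_separated_sketch("", comma))                      # ['']
```

With a variable-length regex, retaining the separators shows exactly which text each match consumed.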

linesep.split_terminated(s: AnyStr, sep: Union[AnyStr, Pattern], retain: bool = False) -> list[AnyStr]

Split a string s into zero or more segments terminated by the string or compiled regex sep. A list of segments is returned; an empty input string will always produce an empty list.

Parameters
  • s – a binary or text string

  • sep – a string or compiled regex that indicates the end of a segment wherever it occurs

  • retain (bool) – whether to include the separators at the end of each segment

Returns

a list of the segments in s

Return type

list of binary or text strings
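A minimal sketch of the terminated convention for a fixed-string separator, using only str.split. The name split_terminated_sketch is hypothetical, and returning unterminated trailing text as a final segment is an assumption not spelled out above.

```python
def split_terminated_sketch(s: str, sep: str, retain: bool = False) -> list[str]:
    *complete, last = s.split(sep)
    # Every piece except the last was followed by a separator in the input.
    segments = [piece + sep if retain else piece for piece in complete]
    # Assumption: leftover text after the final separator is kept as a segment.
    if last:
        segments.append(last)
    return segments

data = "one\ntwo\n"
print(split_terminated_sketch(data, "\n"))               # ['one', 'two']
print(split_terminated_sketch(data, "\n", retain=True))  # ['one\n', 'two\n']
print(split_terminated_sketch("", "\n"))                 # []
```

For this particular input the retain=True result matches data.splitlines(keepends=True), although splitlines also recognizes line boundaries other than "\n".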

Joining Strings

linesep.join_preceded(iterable: Iterable, sep: AnyStr) -> AnyStr

Join the elements of iterable together, preceding each one with sep.

Parameters
  • iterable – an iterable of binary or text strings

  • sep – a binary or text string

Return type

a binary or text string

linesep.join_separated(iterable: Iterable, sep: AnyStr) -> AnyStr

Join the elements of iterable together, separating consecutive elements with sep.

Parameters
  • iterable – an iterable of binary or text strings

  • sep – a binary or text string

Return type

a binary or text string

linesep.join_terminated(iterable: Iterable, sep: AnyStr) -> AnyStr

Join the elements of iterable together, appending sep to each one.

Parameters
  • iterable – an iterable of binary or text strings

  • sep – a binary or text string

Return type

a binary or text string
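The three joining conventions differ only in where sep is placed. The following standard-library sketch shows them side by side; the *_sketch names are hypothetical and the code is an illustration of the descriptions above, not linesep's implementation.

```python
from typing import AnyStr, Iterable

def join_preceded_sketch(iterable: Iterable[AnyStr], sep: AnyStr) -> AnyStr:
    # sep[:0] is an empty string of the same type as sep (str or bytes).
    return sep[:0].join(sep + item for item in iterable)

def join_separated_sketch(iterable: Iterable[AnyStr], sep: AnyStr) -> AnyStr:
    return sep.join(iterable)

def join_terminated_sketch(iterable: Iterable[AnyStr], sep: AnyStr) -> AnyStr:
    return sep[:0].join(item + sep for item in iterable)

items = ["a", "b", "c"]
print(repr(join_preceded_sketch(items, ",")))    # ',a,b,c'
print(repr(join_separated_sketch(items, ",")))   # 'a,b,c'
print(repr(join_terminated_sketch(items, ",")))  # 'a,b,c,'

# Binary input works the same way, as long as every argument is bytes.
print(join_terminated_sketch([b"a", b"b"], b"\0"))  # b'a\x00b\x00'
```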

Reading from Filehandles

Warning

Using the read_* functions with a variable-length regular expression is unreliable, because the results are undefined when a separator in the file crosses a chunk boundary. The only foolproof way to split on such regexes is to first read the whole file into memory and then call one of the split_* functions. As a result, passing a regular expression separator to a read_* function is deprecated starting in version 0.4.0, and support for regular expression separators will be removed in version 1.0.
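To see why chunked reading and variable-length regexes do not mix, consider the sketch below. The function naive_read_separated is a hypothetical, deliberately simple chunked reader written for this illustration; it is not how linesep reads files. It is compared against splitting the whole string at once with the standard library's re module.

```python
import io
import re

def naive_read_separated(fp, pattern, chunk_size):
    """Hypothetical chunked reader that splits the buffer after every read."""
    buf = ""
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Treat everything before the last piece as finished segments.
        *finished, buf = pattern.split(buf)
        yield from finished
    yield buf

newlines = re.compile(r"\n+")
data = "a\n\nb"

# chunk_size=2 splits the "\n\n" separator across two reads.
chunked = list(naive_read_separated(io.StringIO(data), newlines, chunk_size=2))
whole = newlines.split(data)

print(chunked)  # ['a', '', 'b']
print(whole)    # ['a', 'b']
```

The chunked reader sees two one-character separators instead of a single two-character one, so it reports a spurious empty segment. Reading the entire input first avoids the problem.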

linesep.read_preceded(fp: IO, sep: Union[AnyStr, Pattern], retain: bool = False, chunk_size: int = 512) -> Iterator

Read segments from a file-like object fp in which the beginning of each segment is indicated by the string or compiled regex sep. A generator of segments is returned; an empty file will always produce an empty generator.

Data is read from the filehandle chunk_size bytes or characters at a time. If sep is a variable-length compiled regex and a separator in the file crosses a chunk boundary, the results are undefined.

Deprecated since version 0.4.0: Passing a regular expression as a separator is deprecated, and support will be removed in version 1.0.

Parameters
  • fp – a binary or text file-like object

  • sep – a string or compiled regex that indicates the start of a new segment wherever it occurs

  • retain (bool) – whether to include the separators at the beginning of each segment

  • chunk_size (int) – how many bytes or characters to read from fp at a time

Returns

a generator of the segments in fp

Return type

generator of binary or text strings

linesep.read_separated(fp: IO, sep: Union[AnyStr, Pattern], retain: bool = False, chunk_size: int = 512) -> Iterator

Read segments from a file-like object fp in which segments are separated by the string or compiled regex sep. A generator of segments is returned; an empty file will always produce a generator with one element, the empty string.

Data is read from the filehandle chunk_size bytes or characters at a time. If sep is a variable-length compiled regex and a separator in the file crosses a chunk boundary, the results are undefined.

Deprecated since version 0.4.0: Passing a regular expression as a separator is deprecated, and support will be removed in version 1.0.

Parameters
  • fp – a binary or text file-like object

  • sep – a string or compiled regex that indicates the end of one segment and the beginning of another wherever it occurs

  • retain (bool) – When True, the segment separators will be included in the output, with the elements of the generator alternating between segments and separators, starting with a (possibly empty) segment

  • chunk_size (int) – how many bytes or characters to read from fp at a time

Returns

a generator of the segments in fp

Return type

generator of binary or text strings

linesep.read_terminated(fp: IO, sep: Union[AnyStr, Pattern], retain: bool = False, chunk_size: int = 512) -> Iterator

Read segments from a file-like object fp in which the end of each segment is indicated by the string or compiled regex sep. A generator of segments is returned; an empty file will always produce an empty generator.

Data is read from the filehandle chunk_size bytes or characters at a time. If sep is a variable-length compiled regex and a separator in the file crosses a chunk boundary, the results are undefined.

Deprecated since version 0.4.0: Passing a regular expression as a separator is deprecated, and support will be removed in version 1.0.

Parameters
  • fp – a binary or text file-like object

  • sep – a string or compiled regex that indicates the end of a segment wherever it occurs

  • retain (bool) – whether to include the separators at the end of each segment

  • chunk_size (int) – how many bytes or characters to read from fp at a time

Returns

a generator of the segments in fp

Return type

generator of binary or text strings
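The reading conventions can be sketched with the standard library alone for a fixed-string separator on a text file. The three *_sketch generators below are hypothetical illustrations, not linesep's implementation. They show how segments can be yielded lazily while a separator that straddles a chunk boundary stays in the buffer until it is complete.

```python
import io
from typing import IO, Iterator

def read_separated_sketch(fp: IO[str], sep: str, chunk_size: int = 512) -> Iterator[str]:
    buf = ""
    while True:
        chunk = fp.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Only the text after the last complete separator can still grow,
        # so it stays in the buffer until more data arrives.
        *finished, buf = buf.split(sep)
        yield from finished
    yield buf

def read_terminated_sketch(fp: IO[str], sep: str, chunk_size: int = 512) -> Iterator[str]:
    # Delay each segment by one step so a trailing empty segment can be dropped.
    previous = None
    for segment in read_separated_sketch(fp, sep, chunk_size):
        if previous is not None:
            yield previous
        previous = segment
    if previous:
        yield previous

def read_preceded_sketch(fp: IO[str], sep: str, chunk_size: int = 512) -> Iterator[str]:
    # Drop the empty segment that comes before a leading separator.
    for index, segment in enumerate(read_separated_sketch(fp, sep, chunk_size)):
        if index > 0 or segment:
            yield segment

# With chunk_size=2 the first "\r\n" is split across two reads.
data = "a\r\nb\r\n"
print(list(read_separated_sketch(io.StringIO(data), "\r\n", chunk_size=2)))   # ['a', 'b', '']
print(list(read_terminated_sketch(io.StringIO(data), "\r\n", chunk_size=2)))  # ['a', 'b']
print(list(read_terminated_sketch(io.StringIO(""), "\r\n")))                  # []
```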

Writing to Filehandles

linesep.write_preceded(fp: IO, iterable: Iterable, sep: AnyStr) -> None

Write the elements of iterable to the filehandle fp, preceding each one with sep.

Parameters
  • fp – a binary or text file-like object

  • iterable – an iterable of binary or text strings

  • sep – a binary or text string

Returns

None

linesep.write_separated(fp: IO, iterable: Iterable, sep: AnyStr) -> None

Write the elements of iterable to the filehandle fp, separating consecutive elements with sep.

Parameters
  • fp – a binary or text file-like object

  • iterable – an iterable of binary or text strings

  • sep – a binary or text string

Returns

None

linesep.write_terminated(fp: IO, iterable: Iterable, sep: AnyStr) -> None

Write the elements of iterable to the filehandle fp, appending sep to each one.

Parameters
  • fp – a binary or text file-like object

  • iterable – an iterable of binary or text strings

  • sep – a binary or text string

Returns

None
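The writing functions mirror the joining functions, except that output goes to a filehandle one element at a time instead of being assembled into a single string. A minimal standard-library sketch, with hypothetical *_sketch names and in-memory io objects standing in for real files:

```python
import io
from typing import IO, AnyStr, Iterable

def write_preceded_sketch(fp: IO[AnyStr], iterable: Iterable[AnyStr], sep: AnyStr) -> None:
    for item in iterable:
        fp.write(sep)
        fp.write(item)

def write_separated_sketch(fp: IO[AnyStr], iterable: Iterable[AnyStr], sep: AnyStr) -> None:
    first = True
    for item in iterable:
        # Only write the separator between consecutive elements.
        if not first:
            fp.write(sep)
        fp.write(item)
        first = False

def write_terminated_sketch(fp: IO[AnyStr], iterable: Iterable[AnyStr], sep: AnyStr) -> None:
    for item in iterable:
        fp.write(item)
        fp.write(sep)

out = io.StringIO()
write_terminated_sketch(out, ["a", "b"], "\n")
print(repr(out.getvalue()))  # 'a\nb\n'

# A binary filehandle needs binary elements and a binary separator.
raw = io.BytesIO()
write_separated_sketch(raw, [b"a", b"b"], b"\0")
print(raw.getvalue())  # b'a\x00b'
```

Because each element is written as it is consumed, the iterable can be a generator that is never held in memory all at once.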