Skip to content

Utilities

Overview

From time to time, I usually read internal codebase of public repository, then I come up with some best practice overthere that help me a lot in the time of coding.

Code

To avoid de-duplicated on search, using set for list

Pattern

pool = set() # Define empty set
for element in basket:
    ...
    if <condition>: # Defile condition matching
        pool.update(element)
    ...

# Then we can convert to list or yield from set
for runner in pool:
    ... do something
    ...

# or convert back to list
pool: list = list(pool)

Example:

def _list_folders(self, bkt, prefix, artifact_path):
    results = bkt.list_blobs(prefix=prefix, delimiter="/")
    dir_paths = set()
    for page in results.pages:
        dir_paths.update(page.prefixes)

ref: https://github.com/mlflow/mlflow/blob/fe6434603499bc8e420b75492aa1c0ba64064f39/mlflow/store/artifact/gcs_artifact_repo.py#L30

Validate python version

Valiudate specific version in application. The snippet can put in __init__.py or entrypoint script.

```py indent=4 title="init.py"

!/bin/python3

Gloabal

import sys

Declare

required_major, required_minor, _ = "3.9.x".split(".")

Extract

major, minor, micro, releaselevel, serial = sys.version_info

Required

if not all([major >= int(required_major), minor >= int(required_minor)]): raise RuntimeError( f"Python version required to >= {required_major}.{required_minor}.x release. " f"Got Python version {'.'.join([str(x) for x in sys.version_info])}" )

### Detect the system platform identifier

From `sys` library, enumuration of system platforms:

Table 1: System Platform Encoding

| Platform       | Identifier |
| -------------- | ---------- |
| AIX            | 'aix'      |
| Linux          | 'linux'    |
| Windows        | 'win32'    |
| Windows/Cygwin | 'cygwin'   |
| macOS          | 'darwin'   |

```py
import sys

if sys.platform == "linux":
  ...
elif sys.platform.starswith("cy"):
  ...

Using tempfile to temporary download file

The temporary is very useful to be an transition place when download file

Compare with the general way

Type General Using tempfile
Manage directory Required code to establish the folder No,

tempfile.gettempdir()

Return the name of the directory used for temporary files. This defines the default value for the dir argument to all functions in this module.

Python searches a standard list of directories to find one which the calling user can create files in. The list is:

The directory named by the TMPDIR environment variable.

The directory named by the TEMP environment variable.

The directory named by the TMP environment variable.

import tempfile

# Get tempfile
tempfile.gettempdir()
# >>> 'C:\\Users\\<user-name>\\\\AppData\\Local\\Temp'

A platform-specific location:

On Windows, the directories C:\TEMP, C:\TMP, \TEMP, and \TMP, in that order.

On all other platforms, the directories /tmp, /var/tmp, and /usr/tmp, in that order.

As a last resort, the current working directory.

The result of this search is cached, see the description of tempdir below.

Reference