Overview

The buffer metadata cache helps you organize large numbers of buffer files. For example, you might have a project generated by the Analyzer4D software and want to find all buffers with a certain compression. You could open each file and check whether its compression matches your requirement, but that takes a lot of time. Instead, the cache creates a database that stores all the metadata available in the qass.tools.analyzer.buffer_parser.Buffer class. You can then query the cache quickly and programmatically in Python.
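To illustrate the idea behind the cache, here is a minimal sketch that uses only Python's standard sqlite3 module. It is not the library's implementation: the table layout, the helper names build_index and files_with_compression, and the sample rows are assumptions made up for this illustration.

```python
import sqlite3

# Hypothetical metadata rows: (filepath, process, channel, compression_frq).
# In the real cache these values would be read from the buffer file headers.
ROWS = [
    ("my/directory/a_p3c0b01", 3, 1, 8),
    ("my/directory/a_p1c0b01", 1, 1, 8),
    ("my/directory/a_p2c0b01", 2, 1, 4),
]


def build_index(rows):
    """Store the metadata once in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE buffer_metadata ("
        "filepath TEXT PRIMARY KEY, process INTEGER, "
        "channel INTEGER, compression_frq INTEGER)"
    )
    conn.executemany("INSERT INTO buffer_metadata VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def files_with_compression(conn, compression_frq):
    """Answer the query from the index without opening any buffer file."""
    cursor = conn.execute(
        "SELECT filepath FROM buffer_metadata "
        "WHERE compression_frq = ? ORDER BY process",
        (compression_frq,),
    )
    return [row[0] for row in cursor]


conn = build_index(ROWS)
matches = files_with_compression(conn, 8)
print(matches)
```

The expensive step (reading every file header) happens once when the index is built; every later query only touches the database.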

Example

In this example we create an instance of the cache and synchronize it with the directory my/directory. We then build a select statement on BufferMetadata and pass it to the cache to query for all buffers with compression_frq == 8. You can filter on any property listed in BufferMetadata.properties. The BufferMetadataCache.get_matching_buffers() method returns a list of Buffer objects, in this case sorted by process number (as specified in the query).

from qass.tools.analyzer.buffer_metadata_cache import BufferMetadataCache as BMC, BufferMetadata as BM, select

cache = BMC()
cache.synchronize_directory("my/directory")

results = cache.get_matching_buffers(select(BM).filter(BM.compression_frq == 8).order_by(BM.process))

BufferMetadataCache

class qass.tools.analyzer.buffer_metadata_cache.BufferMetadataCache(db_url='sqlite:///:memory:', Buffer_cls=<class 'qass.tools.analyzer.buffer_parser.Buffer'>)

This class acts as a cache for buffer metadata. It uses a database session with a buffer_metadata table to map metadata to files on disk. Querying the cache is much faster than manually opening many buffer files.

class BufferMetadata(**kwargs)

This class acts as a template for buffer files. Its properties represent all available metadata of a buffer file. The class is used internally as a database model. It can also be instantiated as a template for a buffer file: populate the desired properties and pass the object to the cache, which in turn creates a query based on it.

static buffer_to_metadata(buffer)

Converts a Buffer object to a BufferMetadata database object by copying all @property values from the Buffer object into the BufferMetadata object.

Parameters:

buffer (buffer_parser.Buffer) – Buffer object

add_files_to_cache(files: Sequence[Path], verbose: int = 0, batch_size: int = 1000, machine_id: str | None = None, check_synced: bool = True)

Add buffer files to the cache by providing their complete file paths. A file that is already synchronized (determined by filename and directory) is skipped.

Parameters:
  • files (Sequence[Path]) – complete file paths that are added to the cache. Each file path is used with the Buffer class to open a buffer and extract the header information.

  • verbose (int, optional) – verbosity level. 0 = no feedback, 1 = progress bar

  • batch_size (int, optional) – The number of files after which the cache commits a batch to the database

  • machine_id (str, optional) – A unique identifier that distinguishes files from different machines

  • check_synced (bool, optional) – Whether to check if the files that are about to be added are already synced to the cache
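The interplay of batch_size and check_synced can be sketched with the standard sqlite3 module. This is only an illustration under assumptions: the files table, the function add_in_batches, and the sample paths are hypothetical and do not reflect the library's internals.

```python
import sqlite3


def add_in_batches(conn, filepaths, batch_size=1000, check_synced=True):
    """Insert file paths, committing after every `batch_size` new rows.

    Returns a tuple (rows inserted, commits issued).
    """
    known = set()
    if check_synced:
        # Load the paths that are already in the table so they can be skipped.
        known = {row[0] for row in conn.execute("SELECT filepath FROM files")}

    inserted = commits = pending = 0
    for path in filepaths:
        if path in known:
            continue  # already synchronized
        conn.execute("INSERT INTO files (filepath) VALUES (?)", (path,))
        inserted += 1
        pending += 1
        if pending == batch_size:
            conn.commit()
            commits += 1
            pending = 0
    if pending:
        conn.commit()  # flush the last, partially filled batch
        commits += 1
    return inserted, commits


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (filepath TEXT PRIMARY KEY)")
paths = [f"file_{i}" for i in range(5)]  # hypothetical file names

first = add_in_batches(conn, paths, batch_size=2)
second = add_in_batches(conn, paths, batch_size=2)
print(first, second)
```

Larger batches mean fewer commits, while the synced check avoids inserting the same file twice when the function is called again with the same paths.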

get_buffer_metadata_query(buffer_metadata)

Converts a BufferMetadata object to a complete query. Every property of the object is converted into SQL, and the result is returned as a sqlalchemy.orm.query.FromStatement object.

Parameters:

buffer_metadata (BufferMetadata) – The template BufferMetadata object.

Returns:

The sqlalchemy query object

Return type:

sqlalchemy.orm.query.FromStatement
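The template-to-query idea can be sketched in plain Python. The MetadataTemplate dataclass, the template_to_sql function, and the generated SQL text are assumptions for illustration; the library itself builds a SQLAlchemy statement, not a raw string.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MetadataTemplate:
    """Hypothetical stand-in for a template object; None means 'not set'."""

    process: Optional[int] = None
    channel: Optional[int] = None
    compression_frq: Optional[int] = None


def template_to_sql(template):
    """Build a parameterized SELECT from every populated template field."""
    conditions, params = [], []
    for field in fields(template):
        value = getattr(template, field.name)
        if value is not None:
            # Field names come from the dataclass definition, not user input.
            conditions.append(f"{field.name} = ?")
            params.append(value)
    sql = "SELECT filepath FROM buffer_metadata"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


sql, params = template_to_sql(MetadataTemplate(channel=1, compression_frq=4))
print(sql, params)
```

Only the populated fields become conditions, so an empty template selects every entry.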

get_matching_buffers(query: Select) → List[Buffer]

Calls get_matching_files and converts the result to Buffer objects.

Returns:

List of Buffer objects

Return type:

list

get_matching_files(query: Select) → List[str]

Query the cache for all files matching the properties selected by the query object. The use of the buffer_metadata, filter_functions and sort_key parameters is deprecated and will be removed in two minor versions. Use the sqlalchemy query parameter instead.

# cache is a BufferMetadataCache instance
cache.get_matching_files(
    select(BM).filter(BM.channel == 1, BM.compression_frq == 4, BM.process > 100).order_by(BM.process)
)
# Returns all buffer file paths with channel 1, a frequency compression of 4,
# and a process number above 100, sorted by process number

Parameters:

query (Select) – A sqlalchemy select statement specifying the properties of the BufferMetadata objects

Returns:

A list with the paths to the buffer files that match the query

Return type:

list[str]

get_matching_metadata(query: Select) → List[BufferMetadata]

Query the cache for all BufferMetadata database entries matching the query.

Parameters:

query (Select) – A sqlalchemy select statement specifying the properties of the BufferMetadata objects

Returns:

A list with the matching BufferMetadata objects

Return type:

list[BufferMetadata]

get_non_synchronized_files(files: Sequence[Path], machine_id: str | None = None) → Tuple[List[Path], List[Path]]

Calculate the difference between the given set of files and the set of synchronized files.

Parameters:
  • files (Sequence[Path]) – file paths to compare against the cache

  • machine_id (Union[str, None]) – machine identifier

Returns:

A tuple of two lists: the files that are not yet synchronized, and the database entries whose file no longer exists
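The two returned lists correspond to the two directions of a set difference. A minimal sketch, assuming a hypothetical helper split_unsynced and made-up paths (the real method compares against the database):

```python
from pathlib import Path


def split_unsynced(files, synced):
    """Return (files missing from the cache, cache entries without a file)."""
    file_set, synced_set = set(files), set(synced)
    unsynced = sorted(file_set - synced_set)  # on disk, not in the cache
    stale = sorted(synced_set - file_set)     # in the cache, not on disk
    return unsynced, stale


on_disk = [Path("data/a_p1c0b01"), Path("data/a_p2c0b01")]
in_cache = [Path("data/a_p2c0b01"), Path("data/a_p3c0b01")]

unsynced, stale = split_unsynced(on_disk, in_cache)
print(unsynced, stale)
```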

remove_files_from_cache(files: List[Path], verbose=0)

Remove synchronized files from the cache.

Parameters:
  • files (List[Path]) – complete file paths that are present in the cache

  • verbose (int, optional) – verbosity level. 0 = no feedback, 1 = progress bar

synchronize_directory(*paths, sync_subdirectories=True, regex_pattern='^.*[p][0-9]*[c][0-9]{1}[b][0-9]{2}', verbose=1, delete_stale_entries=False, machine_id=None, glob_pattern='*p*c?b*')

Synchronize the buffer files in the given paths that match the regex pattern with the database.

Parameters:
  • paths (str) – The absolute paths to the directories

  • sync_subdirectories (bool, optional) – When True, synchronize all subdirectories recursively, defaults to True

  • regex_pattern (string, optional) – The regex pattern validating the buffer naming format (matched on file.name)

  • verbose (int, optional) – verbosity level. 0 = no feedback, 1 = progress bar

  • machine_id (string, optional) – An optional identifier for a certain machine to enable synchronization of different platforms

  • glob_pattern (string, optional) – The pattern forwarded to Path.glob. This pattern acts as a preselection for the files retrieved for the regex pattern
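To see how the default glob_pattern and regex_pattern work together, the following sketch applies both to a few file names. The file names are invented for illustration, fnmatchcase is used as a stand-in for the name matching done by Path.glob, and using re.match is an assumption about how the pattern is applied.

```python
import re
from fnmatch import fnmatchcase

# Default patterns copied from the synchronize_directory signature.
GLOB_PATTERN = "*p*c?b*"
REGEX_PATTERN = re.compile(r"^.*[p][0-9]*[c][0-9]{1}[b][0-9]{2}")


def passes_glob(name):
    """Coarse preselection, comparable to what Path.glob would return."""
    return fnmatchcase(name, GLOB_PATTERN)


def passes_regex(name):
    """Stricter validation of the buffer naming format."""
    return REGEX_PATTERN.match(name) is not None


# Hypothetical file names
names = ["example_p12c0b01", "example_p12c0b1", "readme.txt"]
for name in names:
    print(name, passes_glob(name), passes_regex(name))
```

The second name passes the glob but fails the regex because only one digit follows the b, which shows why the glob acts as a cheap prefilter and the regex as the final check.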

BufferMetadata

class qass.tools.analyzer.buffer_metadata_cache.BufferMetadata(**kwargs)

This class acts as a template for buffer files. Its properties represent all available metadata of a buffer file. The class is used internally as a database model. It can also be instantiated as a template for a buffer file: populate the desired properties and pass the object to the cache, which in turn creates a query based on it.

static buffer_to_metadata(buffer)

Converts a Buffer object to a BufferMetadata database object by copying all @property values from the Buffer object into the BufferMetadata object.

Parameters:

buffer (buffer_parser.Buffer) – Buffer object

qass.tools.analyzer.buffer_metadata_cache.get_declarative_base()

Getter for the declarative Base that is used by the BufferMetadataCache.

Returns:

declarative base class