granicus_archiver.googledrive.client

exception granicus_archiver.googledrive.client.UploadError[source]

Bases: Exception

Exception raised for errors during upload

exception granicus_archiver.googledrive.client.RateLimitError(msg, req, res: Response)[source]

Bases: HTTPError

Exception raised when a rate limit is encountered

Parameters:

res (Response)

class granicus_archiver.googledrive.client.SchedulersTD[source]

Bases: TypedDict

class granicus_archiver.googledrive.client.GoogleClient(root_conf: Config)[source]

Bases: object

Thin wrapper for aiogoogle.client.Aiogoogle to simplify operations

Parameters:

root_conf (Config)

aiogoogle: Aiogoogle

The actual google client

drive_v3: DriveResource

Drive resource

folder_cache: dict[Path, FileId]

A cache of Drive folders and their id

meta_cache: FileCache

Cache for uploaded file metadata

escape_filename(filename: str | Path) str[source]

Escape filenames for use within quoted portions of api queries

Parameters:

filename (str | Path)

Return type:

str
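
As a rough illustration of the kind of escaping involved: Drive API query strings wrap values in single quotes, so a single quote or backslash inside a filename has to be backslash-escaped. The helper below, escape_for_query, is a hypothetical stand-in written for this example and is not the library's implementation.

```python
from __future__ import annotations

from pathlib import Path


def escape_for_query(filename: str | Path) -> str:
    """Hypothetical helper: escape a name for use inside a quoted query value."""
    name = str(filename)
    # Escape backslashes first so the ones added for quotes are not doubled.
    name = name.replace("\\", "\\\\")
    return name.replace("'", "\\'")


# Build a query clause the way a caller might (the clause itself is illustrative).
query = f"name = '{escape_for_query(Path('Mayor' + chr(39) + 's Report.pdf'))}'"
print(query)
```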

save_cache()[source]

Save folder_cache and meta_cache to disk

find_folder_cache(folder: Path) tuple[list[FileId], Path] | None[source]

Search for cached Drive folders previously found by find_folder()

Returns the longest matching path and id (similar to find_folder())

Parameters:

folder (Path)

Return type:

tuple[list[FileId], Path] | None
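
The "longest matching path" lookup can be pictured with a plain dictionary. This is a minimal sketch under stated assumptions: the cache contents and ids are made up, FileId is treated as a str, and longest_cached_match is a hypothetical name, not the method itself.

```python
from __future__ import annotations

from pathlib import Path

# Illustrative cache contents; the ids are invented for this example.
folder_cache: dict[Path, str] = {
    Path("archive"): "id-a",
    Path("archive/2024"): "id-b",
}


def longest_cached_match(folder: Path) -> tuple[list[str], Path] | None:
    """Return the ids for each cached level of ``folder`` and the matched path."""
    ids: list[str] = []
    matched: Path | None = None
    current = Path()
    for part in folder.parts:
        current = current / part
        f_id = folder_cache.get(current)
        if f_id is None:
            break  # stop at the first level that is not cached
        ids.append(f_id)
        matched = current
    if matched is None:
        return None
    return ids, matched


result = longest_cached_match(Path("archive/2024/clips"))
print(result)
```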

cache_folder_parts(*parts_and_ids: tuple[str, FileId])[source]

Store Drive folder ids in the folder_cache

Each argument should be a tuple of the folder name and its id (for each folder level)

Parameters:

parts_and_ids (tuple[str, FileId])
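
One way to read "a tuple of the folder name and its id (for each folder level)" is that each successive argument extends the path by one level. The sketch below assumes that reading; cache_parts and the ids are hypothetical, and FileId is treated as a str.

```python
from __future__ import annotations

from pathlib import Path

folder_cache: dict[Path, str] = {}


def cache_parts(*parts_and_ids: tuple[str, str]) -> None:
    """Store one cache entry per folder level, keyed by the cumulative path."""
    current = Path()
    for name, f_id in parts_and_ids:
        current = current / name
        folder_cache[current] = f_id


cache_parts(("archive", "id-a"), ("2024", "id-b"))
print(folder_cache)
```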

cache_single_folder(folder: Path, f_id: FileId)[source]

Store a single Drive folder id in the folder_cache

Parameters:
  • folder (Path)

  • f_id (FileId)

find_cached_file_meta(key: tuple[Literal['clips'], CLIP_ID, Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet']] | tuple[Literal['legistar'], GUID, LegistarFileUID] | tuple[Literal['legistar_rguid'], REAL_GUID, LegistarFileUID]) DriveFileMetaFull | None[source]

Search the cache for metadata by model.Clip.id and file type

Parameters:

key (tuple[Literal['clips'], ~granicus_archiver.clips.model.CLIP_ID, ~typing.Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet']] | tuple[~typing.Literal['legistar'], ~granicus_archiver.legistar.types.GUID, ~granicus_archiver.legistar.types.LegistarFileUID] | tuple[~typing.Literal['legistar_rguid'], ~granicus_archiver.legistar.types.REAL_GUID, ~granicus_archiver.legistar.types.LegistarFileUID])

Return type:

DriveFileMetaFull | None

set_cached_file_meta(key: tuple[Literal['clips'], CLIP_ID, Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet']] | tuple[Literal['legistar'], GUID, LegistarFileUID] | tuple[Literal['legistar_rguid'], REAL_GUID, LegistarFileUID], meta: DriveFileMetaFull) None[source]

Store metadata for the model.Clip.id and file type in the cache

Parameters:
  • key (tuple[Literal['clips'], ~granicus_archiver.clips.model.CLIP_ID, ~typing.Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet']] | tuple[~typing.Literal['legistar'], ~granicus_archiver.legistar.types.GUID, ~granicus_archiver.legistar.types.LegistarFileUID] | tuple[~typing.Literal['legistar_rguid'], ~granicus_archiver.legistar.types.REAL_GUID, ~granicus_archiver.legistar.types.LegistarFileUID])

  • meta (DriveFileMetaFull)

Return type:

None

async as_user(*requests, resp_type: type[_Rt], full_res: bool = False) _Rt[source]

Send requests using aiogoogle.client.Aiogoogle.as_user() casting their responses as resp_type

Parameters:
  • resp_type (type[_Rt])

  • full_res (bool)

Return type:

_Rt

async list_files(q: str, spaces: str = 'drive', fields: list[str] | str | None = None, full_res: bool = True) AsyncGenerator[DriveFileMetaFull, None][source]
async list_files(q: str, spaces: str = 'drive', fields: list[str] | str | None = None, full_res: bool = False) AsyncGenerator[DriveFileMeta, None]

List files using the given query string

The result will be an asynchronous generator of DriveFileMeta (if full_res is False) or DriveFileMetaFull (if full_res is True).
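
Consuming an asynchronous generator of file metadata looks roughly like the following. This is a sketch only: fetch_page and the PAGES data are hypothetical stand-ins for real requests, and whether list_files() follows page tokens internally is an assumption. The "files" and "nextPageToken" keys mirror the Drive v3 list response.

```python
from __future__ import annotations

import asyncio
from typing import Any, AsyncGenerator

# Fake paged responses standing in for Drive API results (made-up data).
PAGES: dict[str | None, dict[str, Any]] = {
    None: {"files": [{"id": "1", "name": "a.pdf"}], "nextPageToken": "p2"},
    "p2": {"files": [{"id": "2", "name": "b.pdf"}]},
}


async def fetch_page(page_token: str | None) -> dict[str, Any]:
    """Hypothetical stand-in for a single list request."""
    await asyncio.sleep(0)
    return PAGES[page_token]


async def iter_files() -> AsyncGenerator[dict[str, Any], None]:
    """Yield one metadata dict at a time, following page tokens until done."""
    page_token: str | None = None
    while True:
        page = await fetch_page(page_token)
        for item in page.get("files", []):
            yield item
        page_token = page.get("nextPageToken")
        if page_token is None:
            break


async def main() -> list[str]:
    return [item["name"] async for item in iter_files()]


names = asyncio.run(main())
print(names)
```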

async find_folder(folder: Path) tuple[FileId, Path] | None[source]

Find a (possibly nested) Drive folder

Searches each part of the given folder for a Drive folder with a matching name, then recursively searches for each sub-folder.

Returns the longest matching path and the id for each part of that path. If the root of the given folder was not found, returns None

Parameters:

folder (Path)

Return type:

tuple[FileId, Path] | None

async create_folder(name: str, parent: FileId | None) FileId[source]

Create a Drive folder with the given name

If parent is given, it should be a valid folder id to use as a parent folder

Parameters:
  • name (str)

  • parent (FileId | None)

Return type:

FileId

async create_folder_nested(*parts: str, parent: FileId | None, parent_path: Path | None = None, use_cache: bool = False) FileId[source]

Create a nested group of folders using the parts for each name

Parameters:
  • *parts (str) – The folder name for each directory level

  • parent (FileId | None) – The id of the folder to create the nested folders in

  • parent_path (Path | None) – The root path to use if use_cache is True

  • use_cache (bool) – If True, stores the results in the folder_cache. parent_path must be provided to accurately store the results

Return type:

FileId

async create_folder_from_path(folder: Path) FileId[source]

Find or create a (possibly nested) Drive folder with the given path

Parameters:

folder (Path)

Return type:

FileId
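
The find-or-create behaviour can be sketched against an in-memory folder tree. Everything here is an assumption made for illustration: TREE, find_child, create_child and the generated ids are hypothetical, and no Drive requests are made.

```python
from __future__ import annotations

import itertools
from pathlib import Path

# Made-up folder tree keyed by (parent id, folder name).
TREE: dict[tuple[str | None, str], str] = {(None, "archive"): "id-1"}
_next_id = itertools.count(2)
created: list[str] = []


def find_child(parent: str | None, name: str) -> str | None:
    """Hypothetical stand-in for searching ``parent`` for a folder named ``name``."""
    return TREE.get((parent, name))


def create_child(parent: str | None, name: str) -> str:
    """Hypothetical stand-in for creating a folder; returns its new id."""
    f_id = f"id-{next(_next_id)}"
    TREE[(parent, name)] = f_id
    created.append(name)
    return f_id


def find_or_create(folder: Path) -> str:
    """Walk each level, reusing existing folders and creating missing ones."""
    parent: str | None = None
    for part in folder.parts:
        f_id = find_child(parent, part)
        if f_id is None:
            f_id = create_child(parent, part)
        parent = f_id
    if parent is None:
        raise ValueError("folder must have at least one part")
    return parent


leaf_id = find_or_create(Path("archive/2024/clips"))
print(leaf_id, created)
```

Calling find_or_create() a second time with the same path returns the same id without creating anything.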

async get_file_meta(filename: Path | FileId) DriveFileMetaFull | None[source]

Get metadata for the given file (if it exists)

Parameters:

filename (Path | FileId)

Return type:

DriveFileMetaFull | None

async stream_upload_file(local_file: Path, upload_filename: Path, check_exists: bool = True, folder_id: FileId | None = None, check_hash: bool = True, timeout_total: float | None = None, timeout_chunk: float | None = None) DriveFileMetaFull | None[source]

Upload a file to Drive

Parameters:
  • local_file (Path) – The local filename

  • upload_filename (Path) – Relative path for the uploaded file

  • check_exists (bool) – Whether to check if upload_filename already exists in Drive

  • folder_id (FileId | None) – If given, the parent folder’s id. If not provided, the parent folder(s) will be searched for and created if necessary.

  • check_hash (bool) – If True, the hash of the local file will be checked against the hash from the uploaded metadata

  • timeout_total (float | None)

  • timeout_chunk (float | None)

Returns:

The metadata for the uploaded file. If check_exists is True and the file exists in Drive, None is returned.

Raises:

UploadError – If check_hash is True and the content hashes do not match

Return type:

DriveFileMetaFull | None
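
The check_hash comparison can be illustrated as below. This sketch assumes the hash is SHA-1 (the sha1Checksum field mentioned under check_meta()); UploadError here is a local stand-in class, and sha1_of_file and verify_upload are hypothetical helpers.

```python
from __future__ import annotations

import hashlib
import tempfile
from pathlib import Path


class UploadError(Exception):
    """Local stand-in for the documented exception."""


def sha1_of_file(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Hash the file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha1()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()


def verify_upload(local_file: Path, meta: dict[str, str]) -> None:
    """Raise UploadError if the metadata hash differs from the local hash."""
    if meta.get("sha1Checksum") != sha1_of_file(local_file):
        raise UploadError(f"hash mismatch for {local_file.name}")


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"hello")
    # Matching hash: passes silently.
    verify_upload(sample, {"sha1Checksum": hashlib.sha1(b"hello").hexdigest()})
    # Deliberately wrong hash: should raise.
    try:
        verify_upload(sample, {"sha1Checksum": "0" * 40})
        mismatch_detected = False
    except UploadError:
        mismatch_detected = True

print(mismatch_detected)
```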

async update_existing_file(file_id: FileId, local_file: Path, check_hash: bool = True) DriveFileMetaFull[source]

Upload a revision for a file that already exists in Drive

Parameters:
  • file_id (FileId) – The id of the file to update

  • local_file (Path) – The local file to upload

  • check_hash (bool) – If True, the hash of the local file will be checked against the hash from the uploaded metadata

Returns:

The metadata for the uploaded file

Return type:

DriveFileMetaFull

Raises:

UploadError – If check_hash is True and the content hashes do not match

async prebuild_paths(*paths: Path)[source]

Build multiple Drive folders from the given paths

Uses a tree of PathNode objects to search for and create non-existent folders while minimizing Drive API calls.

This can be more efficient during uploads since the create_folder_from_path() method relies on a Lock for concurrency.

Parameters:

paths (Path)
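
To see why a tree helps, consider paths that share parents: merging them first means each distinct folder is handled once instead of once per path. The Node class below is a hypothetical simplification and not the library's PathNode.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Node:
    """Hypothetical tree node holding a folder name and its sub-folders."""

    name: str
    children: dict[str, "Node"] = field(default_factory=dict)


def build_tree(*paths: Path) -> Node:
    """Merge all paths into one tree so shared parents appear only once."""
    root = Node("")
    for path in paths:
        node = root
        for part in path.parts:
            node = node.children.setdefault(part, Node(part))
    return root


def count_folders(node: Node) -> int:
    """Count the distinct folders below ``node``."""
    return sum(1 + count_folders(child) for child in node.children.values())


tree = build_tree(
    Path("archive/2024/01"),
    Path("archive/2024/02"),
    Path("archive/2023/12"),
)
unique_folders = count_folders(tree)
print(unique_folders)
```

Here three paths with three levels each collapse to six distinct folders, compared with nine lookups if every path were walked independently.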

class granicus_archiver.googledrive.client.ClipGoogleClient(root_conf: Config, max_clips: int = 8, scheduler_limit: int = 8)[source]

Bases: GoogleClient

Client to upload items from a model.ClipCollection

Parameters:
  • root_conf (Config)

  • max_clips (int)

  • scheduler_limit (int)

max_clips: int

Maximum number of Clips to upload

property upload_dir: Path

Root folder name to upload within Drive (alias for config.GoogleConfig.drive_folder)

get_clip_file_upload_path(clip: Clip, key: Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet']) Path[source]

Get the uploaded filename for a clip asset (relative to upload_dir)

Parameters:
  • clip (Clip)

  • key (Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet'])

Return type:

Path

async upload_data_file() DriveFileMetaFull[source]

Upload the data file to Drive

Return type:

DriveFileMetaFull

async get_clip_file_meta(clip: Clip, key: Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet']) tuple[DriveFileMetaFull | None, bool][source]

Get metadata for the given clip file (if it exists)

The local meta_cache will first be checked. If not found, it will be requested using get_file_meta() and stored in the cache (if successful).

Returns:

  • metadata (types.DriveFileMetaFull, optional): The metadata for the file, or None if it does not exist in Drive

  • is_cached (bool): Whether the metadata was retrieved from the meta_cache

Return type:

(tuple)

Parameters:
  • clip (Clip)

  • key (Literal['agenda', 'minutes', 'audio', 'video', 'chapters', 'agenda_packet'])
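
The cache-first lookup and its (metadata, is_cached) result can be sketched as follows. The REMOTE data, the path layout and fetch_remote are hypothetical stand-ins; only the ('clips', id, file type) key shape comes from the signatures above.

```python
from __future__ import annotations

import asyncio

meta_cache: dict[tuple[str, str, str], dict] = {}
# Made-up remote state standing in for files that exist in Drive.
REMOTE = {"clips/100/agenda.pdf": {"id": "file-1", "name": "agenda.pdf"}}
remote_calls = 0


async def fetch_remote(filename: str) -> dict | None:
    """Hypothetical stand-in for a remote metadata request."""
    global remote_calls
    remote_calls += 1
    await asyncio.sleep(0)
    return REMOTE.get(filename)


async def get_meta(clip_id: str, key: str) -> tuple[dict | None, bool]:
    """Check the cache first; on a miss, ask the remote and cache any hit."""
    cache_key = ("clips", clip_id, key)
    cached = meta_cache.get(cache_key)
    if cached is not None:
        return cached, True
    meta = await fetch_remote(f"clips/{clip_id}/{key}.pdf")
    if meta is not None:
        meta_cache[cache_key] = meta
    return meta, False


async def main():
    first = await get_meta("100", "agenda")
    second = await get_meta("100", "agenda")
    missing = await get_meta("100", "minutes")
    return first, second, missing


first, second, missing = asyncio.run(main())
print(first, second, missing)
```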

async upload_clip(clip: Clip) bool[source]

Upload all assets for the given clip

Parameters:

clip (Clip)

Return type:

bool

async check_clip_needs_upload(clip: Clip) tuple[bool, Clip, set[Path]][source]

Check if the given clip has any assets that need to be uploaded

Parameters:

clip (Clip)

Return type:

tuple[bool, Clip, set[Path]]

async handle_upload_check_jobs() None[source]

Wait for jobs from check_clip_needs_upload() and spawn their upload_clip() jobs

The folders from each incoming job will be created with prebuild_paths() before spawning the upload jobs

Return type:

None

async upload_all(clips: ClipCollection) None[source]

Upload assets for the given clips (up to max_clips)

Parameters:

clips (ClipCollection)

Return type:

None
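
The check-then-upload flow described for handle_upload_check_jobs() and upload_all() resembles a producer/consumer pipeline. The sketch below is one possible arrangement using asyncio.Queue, not the library's scheduler; needs_upload, upload, MAX_ITEMS and the item names are all hypothetical.

```python
from __future__ import annotations

import asyncio

MAX_ITEMS = 2  # plays the role of max_clips in this sketch
uploaded: list[str] = []


async def needs_upload(item: str) -> tuple[bool, str]:
    """Hypothetical check: items prefixed with 'done-' are already uploaded."""
    await asyncio.sleep(0)
    return (not item.startswith("done-"), item)


async def upload(item: str) -> None:
    """Hypothetical upload job."""
    await asyncio.sleep(0)
    uploaded.append(item)


async def handle_checks(queue: asyncio.Queue) -> None:
    """Wait for check jobs and spawn an upload job for each item that needs one."""
    uploads: list[asyncio.Task] = []
    while True:
        job = await queue.get()
        if job is None:  # sentinel: no more check jobs
            break
        needed, item = await job
        if needed and len(uploads) < MAX_ITEMS:
            uploads.append(asyncio.create_task(upload(item)))
    await asyncio.gather(*uploads)


async def upload_all(items: list[str]) -> None:
    queue: asyncio.Queue = asyncio.Queue()
    consumer = asyncio.create_task(handle_checks(queue))
    for item in items:
        await queue.put(asyncio.create_task(needs_upload(item)))
    await queue.put(None)
    await consumer


asyncio.run(upload_all(["done-a", "b", "c", "d"]))
print(sorted(uploaded))
```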

async check_meta(clips: ClipCollection, enable_hashes: bool, hash_logfile: Path | None = None) bool[source]

Check metadata for all available files stored on Drive

Parameters:
  • enable_hashes (bool) – If True, the sha1Checksum will be compared with the hash of the local file contents

  • hash_logfile (Path | None) – Optional file to store / load the local file hashes. This can greatly reduce the time required since most files are rather large.

  • clips (ClipCollection)

Note

If using hash_logfile, it is recommended to delete it periodically, especially if there have been any major changes to the local files.

Raises:

UploadError – If enable_hashes is True and the content hashes do not match

Returns:

Whether any changes to the meta_cache were made

Return type:

bool
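
The benefit of a hash log, and the reason for the note about deleting it, can be seen in a small sketch. The JSON layout and the cached_hash helper are assumptions made for illustration; the real hash_logfile format is not described here.

```python
from __future__ import annotations

import hashlib
import json
import tempfile
from pathlib import Path

hash_calls = 0


def sha1_of_file(path: Path) -> str:
    """Hash the whole file (counted so the cache effect is visible)."""
    global hash_calls
    hash_calls += 1
    return hashlib.sha1(path.read_bytes()).hexdigest()


def cached_hash(path: Path, logfile: Path) -> str:
    """Return the logged hash if present; otherwise compute and log it."""
    known = json.loads(logfile.read_text()) if logfile.exists() else {}
    key = str(path)
    if key not in known:
        known[key] = sha1_of_file(path)
        logfile.write_text(json.dumps(known, indent=2))
    return known[key]


with tempfile.TemporaryDirectory() as tmp:
    video = Path(tmp) / "video.mp4"
    video.write_bytes(b"not really a video")
    log = Path(tmp) / "hashes.json"
    first_hash = cached_hash(video, log)
    second_hash = cached_hash(video, log)  # served from the log file

print(hash_calls)
```

Because a logged path is never re-hashed in this sketch, a file that changes on disk keeps its old logged hash until the log is deleted.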

class granicus_archiver.googledrive.client._LegModelT

TypeVar for a subclass of AbstractLegistarModel

alias of TypeVar(‘_LegModelT’, bound=AbstractLegistarModel)

class granicus_archiver.googledrive.client.AbstractLegistarGoogleClient(root_conf: Config, max_clips: int, legistar_data: _LegModelT | None = None)[source]

Bases: GoogleClient, Generic[_GuidT, _ItemT, _LegModelT], ABC

Parameters:
  • root_conf (Config)

  • max_clips (int)

  • legistar_data (_LegModelT | None)

max_clips: int

Maximum number of items to upload

legistar_data: _LegModelT

A LegistarData instance

async upload_data_file() DriveFileMetaFull[source]

Upload the data file to Drive

Return type:

DriveFileMetaFull

get_file_upload_path(guid: _GuidT, uid: LegistarFileUID) Path[source]

Get the uploaded filename for a legistar asset (relative to upload_dir)

Parameters:
  • guid (_GuidT)

  • uid (LegistarFileUID)

Return type:

Path

get_local_meta(guid: _GuidT, uid: LegistarFileUID) FileMeta | None[source]

Get the local file metadata for the given guid and uid

Parameters:
  • guid (_GuidT)

  • uid (LegistarFileUID)

Return type:

FileMeta | None

async get_remote_meta(guid: _GuidT, uid: LegistarFileUID) tuple[DriveFileMetaFull | None, bool][source]

Get metadata for the filename matching guid and uid (if it exists)

The local meta_cache will first be checked. If not found, it will be requested using get_file_meta() and stored in the cache (if successful).

Returns:

  • metadata (DriveFileMetaFull, optional): The metadata for the file, or None if it does not exist in Drive

  • is_cached (bool): Whether the metadata was retrieved from the meta_cache

Return type:

(tuple)

Parameters:
  • guid (_GuidT)

  • uid (LegistarFileUID)

async upload_legistar_file(guid: _GuidT, uid: LegistarFileUID, filename: Path, upload_filename: Path, folder_id: FileId) bool[source]

Upload a single legistar file

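Parameters:
  • guid (_GuidT)

  • uid (LegistarFileUID)

  • filename (Path)

  • upload_filename (Path)

  • folder_id (FileId)
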
Return type:

bool

async upload_legistar_item(guid: _GuidT)[source]

Upload all files for the legistar item matching the given guid

Parameters:

guid (_GuidT)

async check_item_needs_upload(guid: _GuidT) tuple[bool, _GuidT, set[Path]][source]

Check if the item matching guid has any local files that need to be uploaded

Parameters:

guid (_GuidT)

Return type:

tuple[bool, _GuidT, set[Path]]

async handle_upload_check_jobs() None[source]

Wait for jobs from check_item_needs_upload() and spawn their upload_legistar_item() jobs

The folders from each incoming job will be created with prebuild_paths() before spawning the upload jobs

Return type:

None

async upload_all()[source]

Upload files for all items in legistar_data (up to max_clips)

async check_meta(enable_hashes: bool) bool[source]

Check metadata for all available files stored on Drive

Parameters:

enable_hashes (bool) – If True, the sha1Checksum will be compared with the hash of the local file contents

Raises:

UploadError – If enable_hashes is True and the content hashes do not match

Returns:

Whether any changes to the meta_cache were made

Return type:

bool

class granicus_archiver.googledrive.client.LegistarGoogleClient(root_conf: Config, max_clips: int, legistar_data: _LegModelT | None = None)[source]

Bases: AbstractLegistarGoogleClient[GUID, DetailPageResult, LegistarData]

Client to upload items from LegistarData

Parameters:
  • root_conf (Config)

  • max_clips (int)

  • legistar_data (_LegModelT | None)

property upload_dir: Path

Root folder name to upload within Drive (alias for config.GoogleConfig.legistar_drive_folder)

class granicus_archiver.googledrive.client.RGuidLegistarGoogleClient(root_conf: Config, max_clips: int, legistar_data: _LegModelT | None = None)[source]

Bases: AbstractLegistarGoogleClient[REAL_GUID, RGuidDetailResult, RGuidLegistarData]

Client to upload items from RGuidLegistarData

Parameters:
  • root_conf (Config)

  • max_clips (int)

  • legistar_data (_LegModelT | None)

property upload_dir: Path

Root folder name to upload within Drive (alias for config.GoogleConfig.rguid_legistar_drive_folder)