granicus_archiver.legistar.model¶
- granicus_archiver.legistar.model.is_attachment_uid(uid: LegistarFileUID) bool[source]¶
Returns True if the given uid is an attachment reference
- Parameters:
uid (LegistarFileUID)
- Return type:
bool
- granicus_archiver.legistar.model.uid_to_attachment_name(uid: LegistarFileUID) AttachmentName[source]¶
Convert the given LegistarFileUID to an AttachmentName
- Raises:
TypeError – If the uid is not an attachment reference
- Parameters:
uid (LegistarFileUID)
- Return type:
AttachmentName
- granicus_archiver.legistar.model.attachment_name_to_uid(name: AttachmentName) LegistarFileUID[source]¶
Convert the given AttachmentName to a LegistarFileUID
- Parameters:
name (AttachmentName)
- Return type:
LegistarFileUID
- granicus_archiver.legistar.model.uid_to_file_key(uid: LegistarFileUID) Literal['agenda', 'minutes', 'agenda_packet', 'video'][source]¶
Convert the given LegistarFileUID to a LegistarFileKey
- Raises:
TypeError – If the uid is not a valid key
- Parameters:
uid (LegistarFileUID)
- Return type:
Literal['agenda', 'minutes', 'agenda_packet', 'video']
- granicus_archiver.legistar.model.file_key_to_uid(key: Literal['agenda', 'minutes', 'agenda_packet', 'video']) LegistarFileUID[source]¶
Convert the given LegistarFileKey to a LegistarFileUID
- Parameters:
key (Literal['agenda', 'minutes', 'agenda_packet', 'video'])
- Return type:
LegistarFileUID
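The conversion helpers above form a round trip between a single uid namespace and the two kinds of keys. The sketch below illustrates that contract only. The `attachment:` prefix and the `NewType` stand-ins are assumptions made for the example; the encoding the module actually uses is not shown on this page.

```python
from typing import Literal, NewType, get_args

# Stand-ins for the types named in the signatures above.
LegistarFileKey = Literal['agenda', 'minutes', 'agenda_packet', 'video']
AttachmentName = NewType('AttachmentName', str)
LegistarFileUID = NewType('LegistarFileUID', str)

# Hypothetical encoding: attachments carry a prefix, file keys are used as-is.
_ATTACHMENT_PREFIX = 'attachment:'
_FILE_KEYS = get_args(LegistarFileKey)


def is_attachment_uid(uid: LegistarFileUID) -> bool:
    return uid.startswith(_ATTACHMENT_PREFIX)


def uid_to_attachment_name(uid: LegistarFileUID) -> AttachmentName:
    if not is_attachment_uid(uid):
        raise TypeError(f'{uid!r} is not an attachment reference')
    return AttachmentName(uid[len(_ATTACHMENT_PREFIX):])


def attachment_name_to_uid(name: AttachmentName) -> LegistarFileUID:
    return LegistarFileUID(f'{_ATTACHMENT_PREFIX}{name}')


def uid_to_file_key(uid: LegistarFileUID) -> str:
    if uid not in _FILE_KEYS:
        raise TypeError(f'{uid!r} is not a valid key')
    return uid


def file_key_to_uid(key: str) -> LegistarFileUID:
    return LegistarFileUID(key)
```

With this layout each pair of functions is an inverse of the other, and passing a uid of the wrong kind raises `TypeError` as documented above.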
- class granicus_archiver.legistar.model.FilePathURL(key: KT, filename: Path, url: URL)[source]¶
Bases: NamedTuple, Generic[KT]
- Parameters:
key (KT)
filename (Path)
url (URL)
- key: KT¶
Alias for field number 0
- filename: Path¶
Alias for field number 1
- class granicus_archiver.legistar.model.FilePathURLComplete(key: KT, filename: Path, url: URL, complete: bool)[source]¶
Bases: NamedTuple, Generic[KT]
- key: KT¶
Alias for field number 0
- filename: Path¶
Alias for field number 1
- class granicus_archiver.legistar.model.UpdateResult(changed: bool, link_keys: list[LegistarFileKey], attachment_keys: list[AttachmentName], attributes: dict[str, Any] | None = None)[source]¶
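Bases: NamedTuple
- Parameters:
changed (bool)
link_keys (list[LegistarFileKey])
attachment_keys (list[AttachmentName])
attributes (dict[str, Any] | None)
- link_keys: list[Literal['agenda', 'minutes', 'agenda_packet', 'video']]¶
Any URL attributes from DetailPageLinks that changed
- attachment_keys: list[AttachmentName]¶
Any keys in DetailPageLinks.attachments that changed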
- class granicus_archiver.legistar.model.AbstractFile(name: KT, filename: Path, metadata: FileMeta, pdf_links_removed: bool = False)[source]¶
Bases: Serializable, ABC, Generic[KT]
Abstract base class for file information
- name: KT¶
File key
- filename: Path¶
Local file path
- remove_pdf_links() bool[source]¶
Strip embedded links from the pdf file
If the file is not a pdf or if pdf_links_removed is already set to True, no alteration will be performed.
This removes the URL from the hardcoded links only and does not reformat the text. It will still appear as blue with an underline, but will no longer be clickable or have a URL action.
The resulting file will have the same path and the metadata will be updated with the new file size (types.FileMeta.content_length).
The pdf_links_removed flag will then be set to True
- Return type:
bool
- class granicus_archiver.legistar.model.LegistarFile(name: KT, filename: Path, metadata: FileMeta, pdf_links_removed: bool = False)[source]¶
Bases: AbstractFile[Literal['agenda', 'minutes', 'agenda_packet', 'video']]
Information for a downloaded file within LegistarFiles.files using LegistarFileKey for the name attribute
- class granicus_archiver.legistar.model.AttachmentFile(name: KT, filename: Path, metadata: FileMeta, pdf_links_removed: bool = False)[source]¶
Bases: AbstractFile[AttachmentName]
Information for a downloaded attachment within LegistarFiles.attachments using AttachmentName for the name attribute
- class granicus_archiver.legistar.model.LegistarFiles(guid: ~granicus_archiver.legistar.types.GUID, files: dict[~typing.Literal['agenda', 'minutes', 'agenda_packet', 'video'], ~granicus_archiver.legistar.model.LegistarFile] = <factory>, attachments: dict[~granicus_archiver.legistar.types.AttachmentName, ~granicus_archiver.legistar.model.AttachmentFile | None] = <factory>)[source]¶
Bases: Serializable
Collection of files for a DetailPageResult
- Parameters:
guid (GUID)
files (dict[Literal['agenda', 'minutes', 'agenda_packet', 'video'], ~granicus_archiver.legistar.model.LegistarFile])
attachments (dict[AttachmentName, AttachmentFile | None])
- guid: GUID¶
The guid of the DetailPageResult
- files: dict[Literal['agenda', 'minutes', 'agenda_packet', 'video'], LegistarFile]¶
Downloaded LegistarFile information
- attachments: dict[AttachmentName, AttachmentFile | None]¶
Additional file attachments as AttachmentFile objects
- remove_all_pdf_links() bool[source]¶
Call remove_pdf_links() on all files and attachments
- Return type:
bool
- get_file_uid(key: Literal['agenda', 'minutes', 'agenda_packet', 'video']) LegistarFileUID[source]¶
Get a unique key for the given LegistarFileKey
- Parameters:
key (Literal['agenda', 'minutes', 'agenda_packet', 'video'])
- Return type:
LegistarFileUID
- get_attachment_uid(name: AttachmentName) LegistarFileUID[source]¶
Get a unique key for the given AttachmentName
- Parameters:
name (AttachmentName)
- Return type:
LegistarFileUID
- resolve_file_uid(uid: LegistarFileUID) tuple[Literal['agenda', 'minutes', 'agenda_packet', 'video'] | AttachmentName, bool][source]¶
Resolve the LegistarFileUID to its original form
- Returns:
key: A LegistarFileKey or AttachmentName
is_attachment: True if the key represents an attachment
- Return type:
(tuple)
- Parameters:
uid (LegistarFileUID)
- resolve_uid(uid: LegistarFileUID) LegistarFile | AttachmentFile | None[source]¶
Get the LegistarFile or AttachmentFile referenced by the given uid (or None if it does not exist)
- Parameters:
uid (LegistarFileUID)
- Return type:
LegistarFile | AttachmentFile | None
- ensure_local_hashes(legistar_data: LegistarData, check_existing: bool = False) bool[source]¶
Ensure that all local files have a sha1 hash stored in their metadata
- Parameters:
check_existing (bool) – If True, the hash of the local file will be checked against the stored hash
legistar_data (LegistarData)
- Returns:
True if any hashes were generated or updated
- Return type:
bool
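As a rough illustration of the hashing step, the sketch below computes a sha1 digest for one local file and records it in a metadata mapping. The helper names, the plain `dict` used for metadata, and the choice to overwrite a mismatched hash are all assumptions for the example, not the library's behavior.

```python
import hashlib
from pathlib import Path


def sha1_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex sha1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with path.open('rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def ensure_hash(path: Path, metadata: dict, check_existing: bool = False) -> bool:
    """Store a sha1 hash in metadata; return True if it was added or changed."""
    stored = metadata.get('sha1')
    if stored is not None and not check_existing:
        # A hash is already stored and no verification was requested.
        return False
    current = sha1_of_file(path)
    if current == stored:
        return False
    # Assumption: a missing or mismatched hash is simply replaced.
    metadata['sha1'] = current
    return True
```

Reading in chunks keeps memory use flat for large agenda packets or video files.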
- class granicus_archiver.legistar.model.DetailPageLinks(agenda: ~yarl.URL | None, minutes: ~yarl.URL | None, agenda_packet: ~yarl.URL | None, video: ~yarl.URL | None, attachments: dict[~granicus_archiver.legistar.types.AttachmentName, ~yarl.URL] = <factory>)[source]¶
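Bases: Serializable
Links gathered from a meeting detail page
- Parameters:
agenda (URL | None)
minutes (URL | None)
agenda_packet (URL | None)
video (URL | None)
attachments (dict[AttachmentName, URL])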
- attachments: dict[AttachmentName, URL]¶
Attachment URLs
- get_clip_id_from_video() CLIP_ID | None[source]¶
Parse the clip_id from the video url (if it exists)
- Return type:
CLIP_ID | None
- iter_uids() Iterator[tuple[LegistarFileUID, URL | None]][source]¶
Iterate over all files and attachments as LegistarFileUID, URL tuples
- Return type:
Iterator[tuple[LegistarFileUID, URL | None]]
- update(other: Self) UpdateResult[source]¶
Update self with any changes from other
- Parameters:
other (Self)
- Return type:
UpdateResult
- class granicus_archiver.legistar.model.DetailPageResult(page_url: URL, feed_guid: GUID, location: str, links: DetailPageLinks, agenda_status: Literal['Final', 'Final-Addendum', 'Draft', 'Not Viewable by the Public'], minutes_status: Literal['Final', 'Final-Addendum', 'Draft', 'Not Viewable by the Public'], feed_item: FeedItem, last_fake_stupid_guid: GUID | None = None)[source]¶
Bases: Serializable
Data gathered from a meeting detail (/MeetingDetail.aspx) page
- Parameters:
page_url (URL)
feed_guid (GUID)
location (str)
links (DetailPageLinks)
agenda_status (Literal['Final', 'Final-Addendum', 'Draft', 'Not Viewable by the Public'])
minutes_status (Literal['Final', 'Final-Addendum', 'Draft', 'Not Viewable by the Public'])
feed_item (FeedItem)
last_fake_stupid_guid (GUID | None)
- page_url: URL¶
The detail page url (from rss_parser.FeedItem.link)
- links: DetailPageLinks¶
URL data
- agenda_status: Literal['Final', 'Final-Addendum', 'Draft', 'Not Viewable by the Public']¶
Agenda status
- minutes_status: Literal['Final', 'Final-Addendum', 'Draft', 'Not Viewable by the Public']¶
Minutes status
- last_fake_stupid_guid: GUID | None = None¶
The last known guid with the stupid timestamp part
This is to avoid having to reparse the detail pages since Legistar apparently likes to change ALL the guids at the beginning of every year. SMH this is ridiculous!
- property clip_id: CLIP_ID | None¶
The clip_id parsed from DetailPageLinks.get_clip_id_from_video()
- property agenda_final: bool¶
True if agenda_status is final
- property minutes_final: bool¶
True if minutes_status is final
- property is_addendum: bool¶
True if agenda_status or minutes_status is "Final-Addendum"
- property item_status: Literal['final', 'addendum', 'draft', 'hidden']¶
Overall item status
One of: "final", "addendum", "draft", "hidden"
- property is_draft: bool¶
True if agenda_status or minutes_status are set to “Draft”
Whether the item is hidden (if agenda_status is “Not Viewable by the Public”)
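The status properties above can be read as a mapping from the two raw status strings to one overall value. The following is a minimal sketch of such a mapping. The function name and the precedence (hidden, then draft, then addendum, then final) are assumptions; this page does not state how conflicting statuses are resolved.

```python
from typing import Literal

# Raw status values, as listed for agenda_status and minutes_status.
RawStatus = Literal['Final', 'Final-Addendum', 'Draft', 'Not Viewable by the Public']
ItemStatus = Literal['final', 'addendum', 'draft', 'hidden']


def overall_status(agenda_status: RawStatus, minutes_status: RawStatus) -> ItemStatus:
    """Collapse agenda and minutes status into one value (assumed precedence)."""
    # Hidden is keyed off the agenda status only, per the description above.
    if agenda_status == 'Not Viewable by the Public':
        return 'hidden'
    if 'Draft' in (agenda_status, minutes_status):
        return 'draft'
    if 'Final-Addendum' in (agenda_status, minutes_status):
        return 'addendum'
    return 'final'
```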
- property is_future: bool¶
Alias for rss_parser.FeedItem.is_future
- property is_in_past: bool¶
Alias for rss_parser.FeedItem.is_in_past
- property real_guid: REAL_GUID¶
Alias for rss_parser.FeedItem.real_guid
- get_unique_folder() Path[source]¶
Get a local path to store files for this item
The structure will be:
<category>/<year>/<datetime>_<title>_<status>
Where
<category>
<year> Is the 4-digit year of the meeting_date
<datetime> Is the meeting_date (formatted as "%Y%m%d-%H%M")
<title> Is the title
<status> Is the item_status
This combination was chosen to ensure uniqueness.
- Return type:
Path
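The folder layout described for get_unique_folder() can be sketched with `pathlib` and `strftime`. The function below is a hypothetical stand-in that takes the pieces as plain arguments; the real method reads them from the item and may sanitize the title, which this sketch does not attempt.

```python
from datetime import datetime
from pathlib import Path


def build_unique_folder(
    category: str, meeting_date: datetime, title: str, status: str
) -> Path:
    """Build <category>/<year>/<datetime>_<title>_<status>."""
    year = meeting_date.strftime('%Y')
    stamp = meeting_date.strftime('%Y%m%d-%H%M')
    return Path(category) / year / f'{stamp}_{title}_{status}'


folder = build_unique_folder(
    'City Council', datetime(2024, 3, 5, 18, 30), 'Regular Meeting', 'final'
)
```

Including the status in the last path component means a draft and a final version of the same meeting land in different folders.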
- classmethod from_html(html_str: str | bytes, feed_item: FeedItem) Self[source]¶
Create an instance from the raw html from page_url
- update(other: Self) UpdateResult[source]¶
Update self with changed attributes in other
- Parameters:
other (Self)
- Return type:
UpdateResult
- class granicus_archiver.legistar.model.AbstractLegistarModel(root_dir: 'Path')[source]¶
Bases: Serializable, Generic[_GuidT, _ItemT]
- Parameters:
root_dir (Path)
- filter_by_category(*categories: Category, items: dict[_GuidT, _ItemT] | None = None) dict[_GuidT, _ItemT][source]¶
Filter items by category
- filter_by_dt_range(start_dt: datetime | None, end_dt: datetime | None, items: dict[_GuidT, _ItemT] | None = None) dict[_GuidT, _ItemT][source]¶
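Filter items by their meeting_date
- Parameters:
start_dt (datetime | None)
end_dt (datetime | None)
items (dict[_GuidT, _ItemT] | None)
- Return type:
dict[_GuidT, _ItemT]
Note
If start_dt or end_dt are not timezone-aware (no tzinfo), the configured local timezone is assumed.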
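The note about naive datetimes can be illustrated as follows. This is a simplified sketch: items are reduced to a mapping of key to timezone-aware meeting date, the local timezone is passed in explicitly rather than read from configuration, and both bounds are treated as inclusive. All three are assumptions made for the example.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone, tzinfo

# Hypothetical stand-in for the configured local timezone.
LOCAL_TZ = timezone(timedelta(hours=-6))


def as_aware(dt: datetime, local_tz: tzinfo) -> datetime:
    """Attach local_tz to a datetime without tzinfo; leave others untouched."""
    return dt.replace(tzinfo=local_tz) if dt.tzinfo is None else dt


def filter_by_dt_range(
    items: dict[str, datetime],
    start_dt: datetime | None,
    end_dt: datetime | None,
    local_tz: tzinfo = LOCAL_TZ,
) -> dict[str, datetime]:
    """Keep items whose meeting date falls within the given bounds."""
    if start_dt is not None:
        start_dt = as_aware(start_dt, local_tz)
    if end_dt is not None:
        end_dt = as_aware(end_dt, local_tz)
    result = {}
    for key, meeting_date in items.items():
        if start_dt is not None and meeting_date < start_dt:
            continue
        if end_dt is not None and meeting_date > end_dt:
            continue
        result[key] = meeting_date
    return result
```

Normalizing the bounds first avoids the `TypeError` Python raises when a naive datetime is compared with an aware one.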
- class granicus_archiver.legistar.model.LegistarData(root_dir: ~pathlib._local.Path, matched_guids: dict[~granicus_archiver.clips.model.CLIP_ID, ~granicus_archiver.legistar.types.GUID] = <factory>, matched_real_guids: dict[~granicus_archiver.clips.model.CLIP_ID, ~granicus_archiver.legistar.types.REAL_GUID] = <factory>, detail_results: dict[~granicus_archiver.legistar.types.GUID, ~granicus_archiver.legistar.model.DetailPageResult] = <factory>, items_by_clip_id: dict[~granicus_archiver.clips.model.CLIP_ID, ~granicus_archiver.legistar.model.DetailPageResult] = <factory>, files: dict[~granicus_archiver.legistar.types.GUID, ~granicus_archiver.legistar.model.LegistarFiles] = <factory>, clip_id_overrides: dict[~granicus_archiver.legistar.types.REAL_GUID, ~granicus_archiver.clips.model.CLIP_ID | ~typing.Literal[_DoesNotExistEnum.DoesNotExist]] = <factory>)[source]¶
Bases: AbstractLegistarModel[GUID, DetailPageResult]
Container for data gathered from Legistar
- Parameters:
root_dir (Path)
detail_results (dict[GUID, DetailPageResult])
items_by_clip_id (dict[CLIP_ID, DetailPageResult])
files (dict[GUID, LegistarFiles])
clip_id_overrides (dict[REAL_GUID, CLIP_ID | Literal[_DoesNotExistEnum.DoesNotExist]])
- root_dir: Path¶
Root filesystem path for downloading assets
- detail_results: dict[GUID, DetailPageResult]¶
Mapping of parsed DetailPageResult items with their feed_guid as keys
- items_by_clip_id: dict[CLIP_ID, DetailPageResult]¶
Mapping of items in detail_results with a valid clip_id
- files: dict[GUID, LegistarFiles]¶
Mapping of downloaded LegistarFiles with their guid as keys
- clip_id_overrides: dict[REAL_GUID, CLIP_ID | Literal[_DoesNotExistEnum.DoesNotExist]]¶
Mapping of items manually-linked to Clips
- get_clip_id_for_guid(guid: GUID, use_overrides: bool = True) CLIP_ID | None | Literal[_DoesNotExistEnum.DoesNotExist][source]¶
Get the clip id linked to the given guid
- Parameters:
use_overrides (bool) – Whether to use items in clip_id_overrides (default is True)
- Return type:
CLIP_ID | None | Literal[_DoesNotExistEnum.DoesNotExist]
Returns one of:
- get_future_items() Iterator[DetailPageResult][source]¶
Iterate over any items in detail_results that are in the future
- Return type:
Iterator[DetailPageResult]
- ensure_no_future_items() None[source]¶
Ensure there are no items in detail_results that are in the future
- Return type:
None
- ensure_unique_item_folders() None[source]¶
Ensure paths generated by DetailPageResult.get_folder_for_item() are unique among all items in detail_results
- Return type:
None
- is_clip_id_available(clip_id: CLIP_ID) bool[source]¶
Check whether the given clip id is linked to an item (returns True if there is no link)
- is_guid_matched(guid: GUID) bool[source]¶
Check whether the item matching guid has a Clip associated with it
- get_by_real_guid(real_guid: REAL_GUID) DetailPageResult | None[source]¶
Get the DetailPageResult matching the given real_guid
If no match is found, None is returned.
- Parameters:
real_guid (REAL_GUID)
- Return type:
DetailPageResult | None
- find_match_for_clip_id(clip_id: CLIP_ID) DetailPageResult | None | Literal[_DoesNotExistEnum.DoesNotExist][source]¶
Find a DetailPageResult match for the given clip_id
- Parameters:
clip_id (CLIP_ID)
- Return type:
DetailPageResult | None | Literal[_DoesNotExistEnum.DoesNotExist]
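Both get_clip_id_for_guid() and find_match_for_clip_id() can return a value, None, or the DoesNotExist sentinel. The sketch below shows one way a caller might tell the three apart. The enum is redefined locally, the lookup is reduced to two plain dicts keyed by guid, and the reading of None as "not matched yet" versus DoesNotExist as "known to have no clip" is an assumption.

```python
import enum
from typing import Literal, Union


class _DoesNotExistEnum(enum.Enum):
    DoesNotExist = enum.auto()


DoesNotExist = _DoesNotExistEnum.DoesNotExist
ClipResult = Union[str, None, Literal[_DoesNotExistEnum.DoesNotExist]]


def get_clip_id_for_guid(
    guid: str,
    matched: dict,
    overrides: dict,
    use_overrides: bool = True,
) -> ClipResult:
    """Hypothetical lookup: overrides win over parsed matches when enabled."""
    if use_overrides and guid in overrides:
        return overrides[guid]
    return matched.get(guid)


def describe(result: ClipResult) -> str:
    # Identity checks keep the sentinel distinct from None and from any id.
    if result is DoesNotExist:
        return 'no clip exists'
    if result is None:
        return 'unmatched'
    return f'clip {result}'
```

A dedicated sentinel lets an override record "there is no clip" without overloading None, which already means that nothing was found.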
- add_guid_match(clip_id: CLIP_ID, guid: GUID) None[source]¶
Add a Clip.id -> FeedItem match to matched_guids
This may seem redundant considering the find_match_for_clip_id() method, but is intended for adding matches for items without a video url to parse.
- add_clip_match_override(real_guid: REAL_GUID, clip_id: CLIP_ID | None | Literal[_DoesNotExistEnum.DoesNotExist]) None[source]¶
Add a manual override for the given real_guid
- Parameters:
real_guid (REAL_GUID)
clip_id (CLIP_ID | None | Literal[_DoesNotExistEnum.DoesNotExist])
- Return type:
None
- add_detail_result(item: DetailPageResult) None[source]¶
Add a parsed DetailPageResult to detail_results
- Parameters:
item (DetailPageResult)
- Return type:
None
- iter_guid_matches() Iterator[tuple[CLIP_ID, DetailPageResult]][source]¶
Iterate over items added by the add_guid_match() and add_clip_match_override() methods
Results are tuples of CLIP_ID and DetailPageResult
- Return type:
Iterator[tuple[CLIP_ID, DetailPageResult]]
- get_folder_for_item(item: GUID | DetailPageResult) Path[source]¶
Get a local path to store files for a DetailPageResult
See DetailPageResult.get_folder_for_item() for more details.
- Parameters:
item (GUID | DetailPageResult)
- Return type:
Path
- get_or_create_files(guid: GUID) LegistarFiles[source]¶
Get a LegistarFiles instance for guid, creating one if it does not exist
- Parameters:
guid (GUID)
- Return type:
LegistarFiles
- get_file_uid(guid: GUID, key: Literal['agenda', 'minutes', 'agenda_packet', 'video']) LegistarFileUID[source]¶
Get a unique key for the given GUID and LegistarFileKey
- Parameters:
guid (GUID)
key (Literal['agenda', 'minutes', 'agenda_packet', 'video'])
- Return type:
LegistarFileUID
- get_attachment_uid(guid: GUID, name: AttachmentName) LegistarFileUID[source]¶
Get a unique key for the given GUID and AttachmentName
- Parameters:
guid (GUID)
name (AttachmentName)
- Return type:
LegistarFileUID
- get_path_for_uid(guid: GUID, uid: LegistarFileUID) tuple[Path, FileMeta | None][source]¶
Get filesystem path for the GUID and LegistarFileUID
- Returns:
- Return type:
(tuple)
- Parameters:
guid (GUID)
uid (LegistarFileUID)
- iter_files_for_upload(guid: GUID) Iterator[tuple[LegistarFileUID, Path, FileMeta, bool]][source]¶
Iterate over files present locally for the given GUID
- Yields:
uid (LegistarFileUID): The uid for the file type
filename (Path): The local file path
meta (FileMeta): Local meta data for the file
is_attachment (bool): True if the uid refers to an attachment, False otherwise
- Parameters:
guid (GUID)
- Return type:
Iterator[tuple[LegistarFileUID, Path, FileMeta, bool]]
- get_file_path(guid: GUID, key: Literal['agenda', 'minutes', 'agenda_packet', 'video']) Path[source]¶
Get the local path for the LegistarFiles object matching the given guid and file key
- set_uid_complete(guid: GUID, uid: LegistarFileUID, meta: FileMeta, pdf_links_removed: bool = False) LegistarFile | AttachmentFile[source]¶
Set the file or attachment for the given parameters as “complete” (after successful download)
This calls either set_file_complete() or set_attachment_complete() depending on the uid.
- Parameters:
guid (GUID) – The guid of the LegistarFiles object
uid (LegistarFileUID) – The file uid
meta (FileMeta) – Metadata from the download
pdf_links_removed (bool) – Value to set for the file’s pdf_links_removed attribute
- Return type:
LegistarFile | AttachmentFile
- set_file_complete(guid: GUID, key: Literal['agenda', 'minutes', 'agenda_packet', 'video'], meta: FileMeta, pdf_links_removed: bool = False) LegistarFile[source]¶
Set the file for the given parameters as “complete” (after successful download)
- Parameters:
guid (GUID) – The guid of the LegistarFiles object
key (Literal['agenda', 'minutes', 'agenda_packet', 'video']) – The file key
meta (FileMeta) – Metadata from the download
pdf_links_removed (bool) – Value to set for the file’s pdf_links_removed attribute
- Return type:
LegistarFile
- get_attachment_path(guid: GUID, name: AttachmentName) Path[source]¶
Get the local path for an item in LegistarFiles.attachments
- Parameters:
guid (GUID)
name (AttachmentName)
- Return type:
Path
- set_attachment_complete(guid: GUID, name: AttachmentName, meta: FileMeta, pdf_links_removed: bool = False) AttachmentFile[source]¶
Set an item in LegistarFiles.attachments as “complete” (after successful download)
- Parameters:
guid (GUID) – The guid of the LegistarFiles object
name (AttachmentName) – The AttachmentName (key within LegistarFiles.attachments)
meta (FileMeta) – Metadata from the download
pdf_links_removed (bool) – Value to set for the file’s pdf_links_removed attribute
- Return type:
AttachmentFile
- iter_url_paths_uid(guid: GUID) Iterator[FilePathURLComplete[LegistarFileUID]][source]¶
Iterate over all files and attachments for the given guid as FilePathURLComplete tuples using the uid as the key parameter
- Parameters:
guid (GUID)
- Return type:
Iterator[FilePathURLComplete[LegistarFileUID]]
- iter_attachments(guid: GUID) Iterator[FilePathURLComplete[AttachmentName]][source]¶
Iterate over any LegistarFiles.attachments for the given guid (as FilePathURLComplete tuples)
- Parameters:
guid (GUID)
- Return type:
Iterator[FilePathURLComplete[AttachmentName]]
- iter_incomplete_attachments(guid: GUID) Iterator[FilePathURL[AttachmentName]][source]¶
Iterate over LegistarFiles.attachments which have not been downloaded (as FilePathURL tuples)
- Parameters:
guid (GUID)
- Return type:
Iterator[FilePathURL[AttachmentName]]
- iter_url_paths(guid: GUID) Iterator[FilePathURLComplete[Literal['agenda', 'minutes', 'agenda_packet', 'video']]][source]¶
Iterate over items in a LegistarFiles instance (as FilePathURLComplete tuples)
- Parameters:
guid (GUID)
- Return type:
Iterator[FilePathURLComplete[Literal['agenda', 'minutes', 'agenda_packet', 'video']]]
- iter_incomplete_url_paths(guid: GUID) Iterator[FilePathURL[Literal['agenda', 'minutes', 'agenda_packet', 'video']]][source]¶
Iterate over items in a LegistarFiles instance which have not been downloaded (as FilePathURL tuples)
- Parameters:
guid (GUID)
- Return type:
Iterator[FilePathURL[Literal['agenda', 'minutes', 'agenda_packet', 'video']]]
- iter_existing_url_paths(guid: GUID) Iterator[FilePathURL[Literal['agenda', 'minutes', 'agenda_packet', 'video']]][source]¶
Iterate over items in a LegistarFiles instance which have been successfully downloaded (as FilePathURL tuples)
- Parameters:
guid (GUID)
- Return type:
Iterator[FilePathURL[Literal['agenda', 'minutes', 'agenda_packet', 'video']]]