...
Finding archives and/or participants based on alternative names.
Finding Aksesjons/Tilveksts using the Mottat-fra values.
Finding archives that have a Serie/Mappe with a particular name.
Finding archives and/or their descendants based on their creator values.
Indexing
...
Entities
As Asta7 is a generic system, all entities are treated equally and each one is indexed in Elasticsearch separately. This does not work well in most cases: some entities (like Alternativtnavn, Geografy, etc. in ISADG) should not be indexed separately but should instead be indexed as part of the parent entity.
To mitigate this shortcoming, a new option has been added to the entity to mark it as searchable or not.
...
Indexing
...
Related/Inherited System Entities
It is often necessary to search based on a related system entity (participant, restriction, or tag). However, because each entity resides in its own index in Elasticsearch and joins/subqueries are not possible, this has not been possible so far.
To overcome this shortcoming, some options have been added to the entity that make it possible to index the related, and even inherited, system entities with each object.
...
Indexing
...
Member/
...
Descendant Entities
As Elasticsearch is a flat, document-based database that does not support joins or subqueries, the only way to search based on member/descendant entities is to index them with the parent. So every document has the necessary members/descendants and related system entities indexed with it.
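As a minimal sketch of this denormalization, the function below copies an entity and embeds its members recursively, stopping at a maximum depth and capping the number of members per level. The `members` key, the function name, and the parameter names are hypothetical and not taken from Asta7 or Essync; the default values mirror the total nesting limit and nested member limit documented on this page.

```python
from typing import Any


def build_document(entity: dict[str, Any], max_depth: int = 10,
                   max_members: int = 1000, _depth: int = 1) -> dict[str, Any]:
    """Return a copy of `entity` with its members embedded (hypothetical shape)."""
    # Copy every field except the raw member list.
    doc = {key: value for key, value in entity.items() if key != "members"}
    # Only descend while the nesting limit has not been reached.
    if _depth < max_depth:
        members = entity.get("members", [])[:max_members]
        if members:
            doc["members"] = [
                build_document(member, max_depth, max_members, _depth + 1)
                for member in members
            ]
    return doc


# Illustrative data only: Arkiv -> Serie -> Mappe.
arkiv = {
    "type": "Arkiv",
    "name": "A",
    "members": [
        {"type": "Serie", "name": "S",
         "members": [{"type": "Mappe", "name": "M"}]},
    ],
}
doc = build_document(arkiv, max_depth=2)
```

With `max_depth=2` the Serie is embedded in the Arkiv document, while the Mappe below it is left out.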
...
Total nesting limit: 10 (Arkiv → Arkivdel → Serie → Serie → Serie → Stykke → Mappe → Mappe → Mappe → Geografi)
Self nesting limit: 2 (A → Aa → Aaa)
File nesting limit: 1 (Arkiv → Fil)
Nested member limit: 1000
ES Total field limit: 5000 (Default was 1000)
ES Nested field limit: 500 (Default was 50)
ES Nested object limit: 50_000 (Default was 10_000)
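The three ES limits above appear to correspond to the Elasticsearch index settings `index.mapping.total_fields.limit`, `index.mapping.nested_fields.limit`, and `index.mapping.nested_objects.limit`, whose defaults match the ones quoted. Assuming that mapping is correct, a settings body for index creation could be built like this; the function name is hypothetical.

```python
import json


def index_settings(total_fields: int = 5000, nested_fields: int = 500,
                   nested_objects: int = 50_000) -> dict:
    """Build an index settings body with the raised mapping limits."""
    return {
        "settings": {
            # Maximum number of fields in the index mapping.
            "index.mapping.total_fields.limit": total_fields,
            # Maximum number of distinct nested mappings in the index.
            "index.mapping.nested_fields.limit": nested_fields,
            # Maximum number of nested objects in a single document.
            "index.mapping.nested_objects.limit": nested_objects,
        }
    }


settings = index_settings()
settings_json = json.dumps(settings, indent=2)
```

The resulting body can be sent when creating the index. Raising these limits trades memory and mapping size for deeper nesting, so the values are best kept as low as the project allows.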
...
Level 1: All fields
Level 2: All fields for leaf entities (Alternativtnavn, Geografy, etc.), otherwise only required fields (Serie, Stykke)
Level below 2: Required fields
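The level rules above can be sketched as a small selection function. This is only an illustration: the set of leaf entity types, the function name, and the field names in the example are assumptions, and the real system may decide what counts as a leaf entity differently.

```python
# Hypothetical set of leaf entity types (entities without members of their own).
LEAF_ENTITIES = {"Alternativtnavn", "Geografy"}


def fields_to_index(entity_type: str, level: int,
                    all_fields: list[str], required_fields: list[str]) -> list[str]:
    """Pick the fields to index for an entity at the given nesting level."""
    if level == 1:
        return list(all_fields)
    if level == 2 and entity_type in LEAF_ENTITIES:
        return list(all_fields)
    # Non-leaf entities at level 2 and everything deeper: required fields only.
    return list(required_fields)


# Illustrative field names only.
serie_fields = fields_to_index("Serie", 2, ["Navn", "Beskrivelse"], ["Navn"])
geo_fields = fields_to_index("Geografy", 2, ["Navn", "Kode"], ["Navn"])
```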
Note |
---|
Although it is possible to change the search settings at any time, even after project initialization, the change will not take effect immediately. Project search data has to be re-indexed after any settings change; otherwise the search might be broken or not work as expected. Changing these settings might also break existing saved searches. |
Indexing Digital Files
It is also possible to index the associated digital files along with the archive units. When an entity with fileName field(s) is made searchable, the associated files are indexed automatically; nothing else needs to be done. Within some limits, files can also be indexed with member/descendant archive units.
File Fields
ID: file id
Name: file name
Type: file mime type
Length: file length in bytes
Timestamp: file upload time
Content: extracted text from the file
Limits
Supported types: TEXT, XML, HTML, PDF, CSV
Max file size for text extraction: 2GB
File nesting limit: 1 (Arkiv → Fil)
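A check for whether a file qualifies for text extraction under these limits might look like the sketch below. The MIME types chosen for each supported type and the reading of 2GB as 2 × 1024³ bytes are assumptions, as are the names used.

```python
# Assumed MIME types for the supported types TEXT, XML, HTML, PDF and CSV.
SUPPORTED_MIME_TYPES = {
    "text/plain",
    "text/xml",
    "application/xml",
    "text/html",
    "application/pdf",
    "text/csv",
}

# Assumes "2GB" means 2 * 1024**3 bytes.
MAX_EXTRACT_BYTES = 2 * 1024**3


def can_extract_text(mime_type: str, length: int) -> bool:
    """Return True if the file type is supported and the size is within the limit."""
    # Drop parameters such as "; charset=utf-8" before comparing.
    base_type = mime_type.split(";", 1)[0].strip().lower()
    return base_type in SUPPORTED_MIME_TYPES and 0 < length <= MAX_EXTRACT_BYTES
```

Files that fail this check could still be indexed with their metadata (ID, name, type, length, timestamp), just without the Content field.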
Searching
Tree Search
The tree search has been updated with the following changes
...
Based on the selected field type, a particular type of query/operator is selected; other operators available for that field type can be found in the advanced mode. The basic mode performs an AND query. Many more options are available in the advanced mode.
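To illustrate the basic mode, the sketch below turns a list of filter rules into an Elasticsearch `bool` query whose `must` clauses are combined with AND. The mapping from field type to query type, the function name, and the field names are hypothetical; the actual operators chosen by the tree search may differ.

```python
# Hypothetical default query type per field type.
DEFAULT_QUERY_BY_TYPE = {
    "text": "match",     # analyzed full-text match
    "keyword": "term",   # exact value match
    "number": "term",
    "date": "term",
}


def basic_query(rules):
    """Build an AND query from (field, field_type, value) rules."""
    clauses = []
    for field, field_type, value in rules:
        query_type = DEFAULT_QUERY_BY_TYPE.get(field_type, "match")
        clauses.append({query_type: {field: value}})
    # Every clause under "must" has to match, which gives AND semantics.
    return {"query": {"bool": {"must": clauses}}}


# Illustrative field names only.
query = basic_query([("Navn", "text", "Oslo"), ("Status", "keyword", "open")])
```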
File Search
As digital files associated with archive units are also indexed along with the archive units, it is possible to use the file metadata (name, type, length, etc.) and/or content for searching archive units.
If any file entities and/or entities with file member entities (depending on the file nesting limit) are selected, the associated file contents are also considered in the free text search.
Alternatively, the file fields can be selected as filter rules.
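If the files are stored as nested objects on the archive unit document, a search on file content with an optional filter on file type could be expressed as an Elasticsearch `nested` query. This is a sketch under that assumption; the path `files` and the sub-field names `content` and `type` are hypothetical and may not match the real mapping.

```python
from typing import Optional


def file_content_query(text: str, path: str = "files",
                       mime_type: Optional[str] = None) -> dict:
    """Build a nested query matching file content, optionally filtered by type."""
    must = [{"match": {f"{path}.content": text}}]
    if mime_type is not None:
        # Restrict matches to files of one MIME type.
        must.append({"term": {f"{path}.type": mime_type}})
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"must": must}},
            }
        }
    }


pdf_query = file_content_query("protokoll", mime_type="application/pdf")
```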
...
Highlighting
When doing a free text search, matched fragments are highlighted according to the following criteria. If entities are selected, all the top-level text fields of those entities are highlighted. If file entities are selected, file content matches are highlighted in a separate section in the expanded view.
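In Elasticsearch terms, highlighting top-level text fields amounts to adding a `highlight` section to the search body. The helper below is a minimal sketch; the function name, the field names, and the `<em>` tags are assumptions, not the actual Asta7 configuration.

```python
from typing import Any


def with_highlight(search_body: dict[str, Any], text_fields: list[str]) -> dict[str, Any]:
    """Return a copy of the search body with highlighting for the given fields."""
    body = dict(search_body)
    body["highlight"] = {
        # Tags wrapped around each matched fragment.
        "pre_tags": ["<em>"],
        "post_tags": ["</em>"],
        # An empty object per field means default highlighter settings.
        "fields": {name: {} for name in text_fields},
    }
    return body


# Illustrative field names only.
body = with_highlight({"query": {"match": {"Navn": "kirke"}}}, ["Navn", "Beskrivelse"])
```

Highlighting matches inside nested file content would need a different mechanism, which is consistent with file content matches being shown in a separate section.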
Take a look at Search Examples for some example searches.
...
Need to handle orphan members and system relations during sync.
More control over which member to include, like Serie should be included with Arkiv but not with Arkivdel.
Control which fields are searchable and/or searchable as a member. This should make the field limitations unnecessary. If not, the field limitations need to be made configurable instead of hard-coded.
Include members'/descendants' related system entities?
Although there are limitations on the number of nested members, they are not applied during syncing at the moment. Need to fix this.
Need to use Asta7 models and properties in Essync instead of duplicating them.
Multi-level nesting makes the search quite complex and might not be needed for all the projects. Should we consider adding support for flat nesting as well?
Multiple entities have fields with the same name and same type, but one of them has the CodeTableRef/FileName feature. The first field will be used.
Multiple child entities with the same name. The first one will be used. Note that if the later entities have additional fields and/or child entities, those will not be available.
Should file content indexing be configurable?