Commit e3d46653 authored by kristian.noullet

Added manual
parent d8276ab5
# Agnos_mini
</br><h1>Quick Start Guide Steps</h1>
</br><ol start="0">
</br><li>Clone Repository</li>
</br><li>Add the desired Knowledge Graph (KG) as an enum item to <i>structure/config/kg/EnumModelType.java</i>,
</br>e.g. MY_KNOWLEDGE_GRAPH("./my_kg/"). Henceforth, we denote the chosen KG's root path as $KG$.
</br><b>Note</b>: This allows
</br> 1) a user/developer to specify particular configurations for a specific KG (e.g. surface forms for mention detection, underlying caching structures, etc.);
</br> 2) Agnos to create the file tree required by the system in the defined location, in this case in a $KG$ folder under the current working directory;
</br> 3) KG isolation, in order to avoid unexpected interactions on the user side.
</br></li>
</br><li>Run install/BuildFiletree.java. As the name implies, it simply creates the file tree, making it easier to place the required files.
</br></li>
</br><li>Load the KG into an RDF store by setting the location of your RDF-based KG in <i>install.LauncherSetupTDB:KGpath</i> and running it for your defined KG (set in <i>install.LauncherSetupTDB:KG</i>).
</br>Note:
</br>1) If you define an input folder, all files it contains will be added to the Jena TDB.
</br>2) If you already have an existing Apache Jena(-compatible) RDF store, simply put it into $KG$/resources/data/datasets/graph.dataset .
</br>4.1. (Semi-OPTIONAL): Put the SPARQL queries to be executed on the loaded KG for surface form extraction into the appropriate folders.
</br>If you already have a file containing surface forms and their related resources,
</br>please put it in $KG$/resources/data/links_surfaceForms.txt (the file path may be changed in <i>structure.config.FilePaths.java:FILE_ENTITY_SURFACEFORM_LINKING</i>).
</br>The line-wise split delimiter may be defined under <i>structure.config.Strings.java:ENTITY_SURFACE_FORM_LINKING_DELIM</i>; on each line, the resource comes first and the defined literal second.
</br>4.2. Set <i>install.LauncherExecuteQueries:KG</i> to the defined KG and run it. The program extracts the surface forms from your KG and writes them out in a form the system can process.
</br>
</br></li>
</br><li>
</br>Setup complete!
</br>Simple mention detection and candidate generation may now be performed!
</br>As for disambiguation, depending on the scoring scheme you would like to use, a file containing PageRank scores or embeddings may have to be provided.
</br>For RDF PageRank computation, we provide code under <i>install.PageRankComputer</i>. Disambiguation algorithms may then load the scores from the generated $KG$/resources/data/pagerank.nt file using a <i>PageRankLoader</i>.
</br></li>
</br></ol>
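
</br>As a rough illustration of the enum item from step 1, the following sketch shows how a KG entry might carry its root path. <i>EnumModelTypeSketch</i>, <i>root()</i> and <i>resolve(...)</i> are hypothetical names chosen for this example; the actual <i>EnumModelType</i> may be structured differently.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for structure/config/kg/EnumModelType.java.
// Only the constant name and its path argument are taken from the guide.
enum EnumModelTypeSketch {
    // The string is the KG's root path, denoted $KG$ in the guide.
    MY_KNOWLEDGE_GRAPH("./my_kg/");

    private final String root;

    EnumModelTypeSketch(String root) {
        this.root = root;
    }

    /** Root path of this KG, relative to the current working directory. */
    String root() {
        return root;
    }

    /** Resolves a path below this KG's root, so that files of different KGs stay separated. */
    Path resolve(String relative) {
        return Paths.get(root).resolve(relative).normalize();
    }
}
```

</br>With such a helper, <i>EnumModelTypeSketch.MY_KNOWLEDGE_GRAPH.resolve("resources/data/links_surfaceForms.txt")</i> would point at the surface form file of that particular KG.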
</br>
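
</br>To make the layout of the surface form file more concrete, here is a minimal sketch of how such a file could be read into a look-up structure. It assumes one link per line with the resource first and the surface form second, as described in step 4.1. The class <i>SurfaceFormLinks</i>, the tab delimiter used in the usage note and the lower-casing of surface forms are assumptions made for this example, not Agnos' actual loading code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Agnos.
final class SurfaceFormLinks {
    /**
     * Maps each lower-cased surface form to the set of resources it may refer to.
     * Lines that do not contain the delimiter are skipped.
     */
    static Map<String, Set<String>> parse(List<String> lines, String delim) {
        Map<String, Set<String>> bySurfaceForm = new HashMap<>();
        for (String line : lines) {
            // Split into at most two parts: resource first, literal second.
            String[] parts = line.split(Pattern.quote(delim), 2);
            if (parts.length < 2) {
                continue;
            }
            String resource = parts[0].trim();
            String surfaceForm = parts[1].trim().toLowerCase(Locale.ROOT);
            if (resource.isEmpty() || surfaceForm.isEmpty()) {
                continue;
            }
            bySurfaceForm.computeIfAbsent(surfaceForm, k -> new TreeSet<>()).add(resource);
        }
        return bySurfaceForm;
    }
}
```

</br>The lines could be obtained with <i>java.nio.file.Files.readAllLines(...)</i> and passed in together with whatever delimiter is configured, e.g. <i>SurfaceFormLinks.parse(lines, "\t")</i> if the file is tab-separated.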
</br><h1>Running Agnos</h1>
</br>Post-configuration, you may run Agnos by executing launcher.LauncherLinking.java.
</br>It takes a string input and applies exact, case-insensitive mention detection to it, followed by candidate generation and the default disambiguation behaviour.
</br>Results are output to the console.
</br>There is also launcher.LauncherLinkingSample.java, an easily modifiable sample of what the annotation process looks like in code.
</br>
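
</br>The three stages named above (exact case-insensitive mention detection, candidate generation, disambiguation) can be sketched in a few lines. This is a simplified, self-contained illustration and not the code of <i>launcher.LauncherLinking</i>: the class <i>LinkingPipelineSketch</i>, the three-token window and the "highest score wins" rule are all assumptions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical end-to-end sketch, not part of Agnos.
final class LinkingPipelineSketch {
    /** Exact, case-insensitive detection: returns every known surface form found as a token n-gram. */
    static List<String> detect(String text, Set<String> surfaceForms, int maxTokens) {
        // Split on anything that is neither a letter nor a digit.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        List<String> mentions = new ArrayList<>();
        for (int start = 0; start < tokens.length; start++) {
            StringBuilder window = new StringBuilder();
            for (int end = start; end < tokens.length && end - start < maxTokens; end++) {
                if (end > start) {
                    window.append(' ');
                }
                window.append(tokens[end]);
                if (surfaceForms.contains(window.toString())) {
                    mentions.add(window.toString());
                }
            }
        }
        return mentions;
    }

    /** Looks up candidates per mention and keeps the candidate with the highest score. */
    static Map<String, String> link(String text,
                                    Map<String, Set<String>> candidates,
                                    Map<String, Double> scores) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String mention : detect(text, candidates.keySet(), 3)) {
            String best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (String candidate : candidates.get(mention)) {
                // Candidates without a known score default to 0.0.
                double score = scores.getOrDefault(candidate, 0.0);
                if (score > bestScore) {
                    bestScore = score;
                    best = candidate;
                }
            }
            if (best != null) {
                result.put(mention, best);
            }
        }
        return result;
    }
}
```

</br>The keys of the candidate map are expected to be lower-cased surface forms, so that the look-up stays case-insensitive.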
</br><h1>API</h1>
</br>Code for NIF-format-based queries and for calls through JSON is provided in
</br>NIFAPIAnnotator and JSONAPIAnnotator, respectively.
</br>A very basic API front-end page may be downloaded from <a href="https://km.aifb.kit.edu/sites/agnos-demo/">Agnos</a>.
</br>
</br><h1>Mention Detection</h1>
</br>Out of the box, Agnos provides users with two main mention detection mechanisms:
</br><ul>
</br> <li>linking.mentiondetection.exact.MentionDetectorMap (Exact matching)</li>
</br> <li>linking.mentiondetection.fuzzy.MentionDetectorLSH (Fuzzy matching)</li>
</br></ul>
</br>The former performs mention detection by checking whether a possible input is contained within a passed map instance.
</br>The latter utilizes locality-sensitive hashing techniques (MinHash), allowing detection with a user-defined degree of fuzziness.
</br>Please note that <i>linking.mentiondetection.fuzzy.MentionDetectorLSH</i> requires (surface form) structures to be computed prior to linking in order to allow for highly scalable performance.
</br>
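
</br>To give an idea of how MinHash supports fuzzy matching, the sketch below computes MinHash signatures over character trigrams and estimates the Jaccard similarity of two strings from them. It is a generic textbook-style illustration, not the implementation of <i>MentionDetectorLSH</i>: the class <i>MinHashSketch</i>, the hash family and all parameters are assumptions, and the LSH banding step that makes look-ups scale is omitted.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Random;
import java.util.Set;

// Hypothetical illustration of MinHash, not part of Agnos.
final class MinHashSketch {
    private static final long PRIME = 2147483647L; // 2^31 - 1
    private final long[] a;
    private final long[] b;
    private final int shingleSize;

    MinHashSketch(int numHashes, int shingleSize, long seed) {
        Random rnd = new Random(seed);
        this.a = new long[numHashes];
        this.b = new long[numHashes];
        this.shingleSize = shingleSize;
        for (int i = 0; i < numHashes; i++) {
            a[i] = 1 + rnd.nextInt((int) PRIME - 1); // in [1, PRIME - 1]
            b[i] = rnd.nextInt((int) PRIME);         // in [0, PRIME - 1]
        }
    }

    /** Lower-cased character n-grams of the input. */
    Set<String> shingles(String s) {
        String norm = s.toLowerCase(Locale.ROOT);
        Set<String> out = new HashSet<>();
        if (norm.length() <= shingleSize) {
            out.add(norm);
            return out;
        }
        for (int i = 0; i + shingleSize <= norm.length(); i++) {
            out.add(norm.substring(i, i + shingleSize));
        }
        return out;
    }

    /** For each hash function h(x) = (a*x + b) mod PRIME, keeps the minimum over all shingles. */
    long[] signature(String s) {
        long[] sig = new long[a.length];
        Arrays.fill(sig, Long.MAX_VALUE);
        for (String shingle : shingles(s)) {
            long x = shingle.hashCode();
            for (int i = 0; i < a.length; i++) {
                long h = Math.floorMod(a[i] * x + b[i], PRIME);
                if (h < sig[i]) {
                    sig[i] = h;
                }
            }
        }
        return sig;
    }

    /** Fraction of matching signature positions, an estimate of the Jaccard similarity. */
    static double similarity(long[] s1, long[] s2) {
        int same = 0;
        for (int i = 0; i < s1.length; i++) {
            if (s1[i] == s2[i]) {
                same++;
            }
        }
        return (double) same / s1.length;
    }
}
```

</br>A mention would then count as a fuzzy match for a surface form if the estimated similarity exceeds a user-chosen threshold; more hash functions give a more stable estimate at the cost of larger signatures.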
</br><h2>Custom Mention Detection</h2>
</br>Mention detection standards are enforced through structure.interfaces.MentionDetector.
</br>The interface is easy to implement and keeps detection results simple for the rest of the text annotation pipeline to process.
</br>As such, any custom mention detection technique should simply implement it in order to ensure compatibility with the other steps.
</br>With it, extending Agnos' mention detection with, e.g., POS-based methods is relatively straightforward.
</br>
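
</br>The following sketch illustrates what plugging in a custom detector could look like. Since the actual method signatures of <i>structure.interfaces.MentionDetector</i> are not shown in this guide, the example declares its own stand-in interface <i>MentionDetectorSketch</i> with a single hypothetical <i>detect(String)</i> method; a real implementation would have to follow Agnos' interface instead. The detector itself uses runs of capitalized words as a crude substitute for POS-based proper noun detection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; the real structure.interfaces.MentionDetector may look different.
interface MentionDetectorSketch {
    List<String> detect(String input);
}

// Treats each run of capitalized words as one mention (a rough proxy for proper nouns).
final class CapitalizedMentionDetector implements MentionDetectorSketch {
    private static final Pattern CAPITALIZED_RUN =
            Pattern.compile("\\p{Lu}[\\p{L}\\p{N}]*(?:\\s+\\p{Lu}[\\p{L}\\p{N}]*)*");

    @Override
    public List<String> detect(String input) {
        List<String> mentions = new ArrayList<>();
        Matcher matcher = CAPITALIZED_RUN.matcher(input);
        while (matcher.find()) {
            mentions.add(matcher.group());
        }
        return mentions;
    }
}
```

</br>Note that this heuristic also picks up capitalized sentence-initial words, which a POS tagger would typically filter out.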
</br><h1>Candidate Generation</h1>
</br>Agnos mainly utilizes a single candidate generation mechanism: dictionary look-up.
</br>It is implemented within <i>linking.candidategeneration.CandidateGeneratorMap</i> and can be used with a defined mapping.
</br>Custom candidate generation may be performed through implementation of the <i>structure.interfaces.CandidateGenerator</i> interface.
</br>
</br><h1>Disambiguation</h1>
</br>Agnos allows for simple extension of its disambiguation repertoire,
</br>among others through its <i>structure.interfaces.Scorer</i> and <i>structure.interfaces.PostScorer</i> interfaces.
</br>The difference between the two is that <i>structure.interfaces.Scorer</i> is assumed to be a so-called a priori scoring mechanism (meaning a single candidate's score is independent of other candidates), whereas <i>structure.interfaces.PostScorer</i> instances assign different scores to candidate entities depending on the other candidate entities they are detected with, thereby allowing the notion of "context" to play a role.
</br>An example of a <i>structure.interfaces.Scorer</i> instance would be our PageRankScorer.
</br>For <i>structure.interfaces.PostScorer</i> instances, we provide <i>linking.disambiguation.scorers.GraphWalkEmbeddingScorer.java</i> and <i>VicinityScorerDirectedSparseGraph.java</i>, among others.
</br>Which scoring mechanisms are used for disambiguation is configurable through the defined <i>linking.disambiguation.Disambiguator</i> instance by calling the <i>addScorer(...)</i> and <i>addPostScorer(...)</i> methods, respectively.
</br>How single scorers' scores are combined may be defined within their own implementation, which is then applied through our consolidation mechanism <i>linking.disambiguation.ScoreCombines.java</i>.
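
</br>The distinction between a priori and context-dependent scoring can be sketched as follows. Only the method names <i>addScorer(...)</i> and <i>addPostScorer(...)</i> are taken from the text above; the interfaces <i>ScorerSketch</i> and <i>PostScorerSketch</i>, their signatures, the class <i>DisambiguatorSketch</i> and the plain summation of scores are assumptions made for illustration and do not reflect Agnos' actual combination logic.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical a priori scorer: the score depends on the candidate alone.
interface ScorerSketch {
    double score(String candidate);
}

// Hypothetical context-dependent scorer: the score also depends on co-occurring entities.
interface PostScorerSketch {
    double score(String candidate, Collection<String> context);
}

// Hypothetical stand-in for linking.disambiguation.Disambiguator.
final class DisambiguatorSketch {
    private final List<ScorerSketch> scorers = new ArrayList<>();
    private final List<PostScorerSketch> postScorers = new ArrayList<>();

    void addScorer(ScorerSketch scorer) {
        scorers.add(scorer);
    }

    void addPostScorer(PostScorerSketch postScorer) {
        postScorers.add(postScorer);
    }

    /** Sums all scorer outputs per candidate (an assumed combination rule) and returns the best candidate. */
    String disambiguate(List<String> candidates, Collection<String> context) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String candidate : candidates) {
            double total = 0.0;
            for (ScorerSketch scorer : scorers) {
                total += scorer.score(candidate);
            }
            for (PostScorerSketch postScorer : postScorers) {
                total += postScorer.score(candidate, context);
            }
            if (total > bestScore) {
                bestScore = total;
                best = candidate;
            }
        }
        return best;
    }
}
```

</br>In this sketch, a prior such as a PageRank value would be supplied through <i>addScorer(...)</i>, while a scorer that rewards candidates related to other entities in the same text would be supplied through <i>addPostScorer(...)</i>.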