attelo.decoding package¶
Decoding in attelo consists in building discourse graphs from a set of attachment/labelling predictions.
Submodules¶
attelo.decoding.astar module¶
module for building discourse graphs from probability distribution and respecting some constraints, using Astar heuristics based search and variants (beam, b&b)
TODO: unlabelled evaluation seems to bug on RF decoding (relation is of type orange.value -> go see in decoding.py)
-
class
attelo.decoding.astar.
AstarArgs
¶ Bases:
attelo.decoding.astar.AstarArgs
Configuration options for the A* decoder
Parameters: - heuristics (Heuristic) – an a* heuristic funtion (estimate the cost of what has not been explored yet)
- use_prob (bool) – indicates if previous scores are probabilities in [0,1] (to be mapped to -log) or arbitrary scores (untouched)
- beam (int or None) – size of the beam-search (if None: vanilla astar)
- rfc (RfcConstraint) – what sort of right frontier constraint to apply
-
class
attelo.decoding.astar.
AstarDecoder
(astar_args)¶ Bases:
attelo.decoding.interface.Decoder
wrapper for astar decoder to be used by processing pipeline returns the best structure
-
decode
(dpack)¶
-
-
class
attelo.decoding.astar.
DiscData
(parent=None, accessible=None, tolink=None)¶ Bases:
object
Natural reading order decoding: incremental building of tree in order of text (one edu at a time)
Basic discourse data for a state: chosen links between edus at that stage + right-frontier state. To save space, only new links are stored. the complete solution will be built with backpointers via the parent field
RF: right frontier, = admissible attachment point of current discourse unit
Parameters: - parent – parent state (previous decision)
- link ((string, string, string)) – current decision (a triplet: target edu, source edu, relation)
- tolink ([string]) – remaining unattached discourse units
-
accessible
()¶ return the list of edus that are on the right frontier
Return type: [string]
-
final
()¶ return True if there are no more links to be made
-
last_link
()¶ return the link that was made to get to this state, if any
-
link
(to_edu, from_edu, relation, rfc=<RfcConstraint.full: 2>)¶ rfc = “full”: use the distinction coord/subord rfc = “simple”: consider everything as subord rfc = “none” no constraint on attachment
-
tobedone
()¶ return the list of edus to be linked
Return type: [string]
-
class
attelo.decoding.astar.
DiscourseBeamSearch
(heuristic=<function <lambda>>, shared=None, queue_size=10)¶ Bases:
attelo.decoding.astar.DiscourseSearch
,attelo.optimisation.astar.BeamSearch
-
class
attelo.decoding.astar.
DiscourseSearch
(heuristic=<function <lambda>>, shared=None, queue_size=None)¶ Bases:
attelo.optimisation.astar.Search
subtype of astar search for discourse: should be the same for every astar decoder, provided the discourse state is a subclass of DiscourseState
recover solution should be as is, provided a state has at least the following info: - parent: parent state - _link: the actual prediction made at this stage (1 state = 1 relation = (du1, du2, relation)
-
new_state
(data)¶
-
recover_solution
(endstate)¶ follow back pointers to collect list of chosen relations on edus.
-
-
class
attelo.decoding.astar.
DiscourseState
(data, heuristics, shared)¶ Bases:
attelo.optimisation.astar.State
Natural reading order decoding: incremental building of tree in order of text (one edu at a time)
instance of discourse graph with probability for each attachement+relation on a subset of edges.
implements the State interface to be used by Search
strategy: at each step of exploration choose a relation between two edus related by probability distribution, reading order a.k.a NRO “natural reading order”, cf Bramsen et al., 2006. in temporal processing.
‘data’ is set of instantiated relations (typically nothing at the beginning, but could be started with a few chosen relations)
‘shared’ points to shared data between states (here proba distribution between considered pairs of edus at least, but also can include precomputed info for heuristics)
-
h_average
()¶ return the average probability possible when n nodes still need to be attached assuming the best overall prob in the distrib
-
h_best
()¶ return the best probability possible when n nodes still need to be attached assuming the best overall prob in the distrib
-
h_best_overall
()¶ return the best probability possible when n nodes still need to be attached assuming the best overall prob in the distrib
-
h_zero
()¶ always 0
-
is_solution
()¶
-
next_states
()¶ must return a state and a cost TODO: adapt to disc parse, according to choice made for data -> especially update to RFC
-
proba
(edu_pair)¶ return the label and probability that an edu pair are attached, or (“no”, None) if we don’t have a prediction for the pair
Return type: (string, float or None)
information shared between states
-
strategy
()¶ full or not, if the RFC is applied to labelled edu pairs
-
-
class
attelo.decoding.astar.
Heuristic
¶ Bases:
enum.Enum
Heuristic cost to guide A* search with
- zero: see DiscourseState.h_zero
- max: see DiscourseState.h_best_overall
- best: see DiscourseState.h_best
- average: see DiscourseState.h_average
-
average
= <Heuristic.average: 3>¶
-
best
= <Heuristic.best: 2>¶
-
max
= <Heuristic.max: 1>¶
-
zero
= <Heuristic.zero: 0>¶
-
class
attelo.decoding.astar.
RfcConstraint
¶ Bases:
enum.Enum
What sort of right frontier constraint to apply during decoding:
- simple: every relation is treated as subordinating
- full: (falls back to simple in case of unlabelled prediction)
-
full
= <RfcConstraint.full: 2>¶
-
none
= <RfcConstraint.none: 3>¶
-
simple
= <RfcConstraint.simple: 1>¶
-
class
attelo.decoding.astar.
TwoStageNRO
¶ Bases:
attelo.decoding.astar.DiscourseState
similar as above with different handling of inter-sentence and intra-sentence relations
-
next_states
()¶ must return a state and a cost
-
same_sentence
(edu1, edu2)¶ not implemented: will always return False TODO: this should go in preprocessing before launching astar ?? would it be easier to have access to all edu pair features ?? (certainly for that one)
-
-
class
attelo.decoding.astar.
TwoStageNROData
(parent=None, accessible=None, tolink=None)¶ Bases:
attelo.decoding.astar.DiscData
similar as above with different handling of inter-sentence and intra-sentence relations
accessible is list of starting edus (only one for now)
-
accessible
()¶ wip:
-
link
(to_edu, from_edu, relation)¶ WIP
-
update_mode
()¶ switch between intra/inter-sentential parsing mode
-
-
attelo.decoding.astar.
preprocess_heuristics
(cands)¶ - precompute a set of useful information used by heuristics, such as
- best probability
- table of best probability when attaching a node, indexed on that node
format of cands is format given in main decoder: a list of (arg1,arg2,proba,best_relation)
attelo.decoding.baseline module¶
Baseline decoders
-
class
attelo.decoding.baseline.
LastBaseline
¶ Bases:
attelo.decoding.interface.Decoder
attach to last, always
-
decode
(dpack, nonfixed_pairs=None)¶
-
-
class
attelo.decoding.baseline.
LocalBaseline
(threshold, use_prob=True)¶ Bases:
attelo.decoding.interface.Decoder
just attach locally if prob is > threshold
-
decode
(dpack, nonfixed_pairs=None)¶
-
attelo.decoding.greedy module¶
Implementation of the locally greedy approach similar with DuVerle & Predinger (2009, 2010) (but adapted for SDRT, where the notion of adjacency includes embedded segments)
July 2012
@author: stergos
-
class
attelo.decoding.greedy.
LocallyGreedy
¶ Bases:
attelo.decoding.interface.Decoder
The locally greedy decoder
-
decode
(dpack)¶
-
-
class
attelo.decoding.greedy.
LocallyGreedyState
(instances)¶ Bases:
object
the mutable parts of the locally greedy algorithm
-
decode
()¶ Run the decoder
:rtype [(EDU, EDU, string)]
-
-
attelo.decoding.greedy.
are_strictly_adjacent
(one, two, edus)¶ returns True in the following cases
[one] [two] [two] [one]
in the rest of the cases (when there is an edu between one and two) it returns False
-
attelo.decoding.greedy.
get_neighbours
(edus)¶ Return a mapping from each EDU to its neighbours
Return type: Dict Edu [Edu]
-
attelo.decoding.greedy.
is_embedded
(one, two)¶ returns True when one is embedded in two, that is
[two ... [one] ... ]
returns False in all other cases
attelo.decoding.interface module¶
Common interface that all decoders must implement
-
class
attelo.decoding.interface.
Decoder
¶ Bases:
attelo.parser.interface.Parser
A decoder is a function which given a probability distribution (see below) and some control parameters, returns a sequence of predictions.
Most decoders only really return one prediction in practice, but some, like the A* decoder might have able to return a ranked sequence of the “N best” predictions it can find
We have a few informal types to consider here:
- a link ((string, string, string)) represents a link between a pair of EDUs. The first two items are their identifiers, and the third is the link label
- a candidate link (or candidate, to be short, (EDU, EDU, float, string)) is a link with a probability attached
- a prediction is morally a set (in practice a list) of links
- a distribution is morally a set of proposed links
Note that a decoder could also be seen/used as a sort of crude parser (with a fit function is a no-op). You’ll likely want to prefix it with a parser that extracts weights from datapacks lest you work with the somewhat unformative 1.0s everywhere.
-
decode
(dpack)¶ Return the N-best predictions in the form of a datapack per prediction.
-
fit
(dpacks, targets, nonfixed_pairs=None, cache=None)¶
-
transform
(dpack, nonfixed_pairs=None)¶
attelo.decoding.local module¶
Local decoders make decisions for each edge independently.
-
class
attelo.decoding.local.
AsManyDecoder
¶ Bases:
attelo.decoding.interface.Decoder
Greedy decoder that picks as many edges as there are real EDUs.
The output structure is a graph that has the same number of edges as a spanning tree over the EDUs. It can be non-connex, contain cycles and re-entrancies.
-
decode
(dpack)¶ Return the set of top N edges
-
-
class
attelo.decoding.local.
BestIncomingDecoder
¶ Bases:
attelo.decoding.interface.Decoder
Greedy decoder that picks the best incoming edge for each EDU.
The output structure is a graph that contains exactly one incoming edge for each EDU, thus it has the same number of edges as a spanning tree over the EDUs. It can be non-connex or contain cycles, but no re-entrancy.
-
decode
(dpack)¶ Return the best incoming edge for each EDU
-
attelo.decoding.mst module¶
Created on Jun 27, 2012
@author: stergos, jrmyp
-
class
attelo.decoding.mst.
MsdagDecoder
(root_strategy, use_prob=True)¶ Bases:
attelo.decoding.mst.MstDecoder
Attach according to MSDAG (subgraph of original)
-
decode
(dpack, nonfixed_pairs=None)¶
-
-
class
attelo.decoding.mst.
MstDecoder
(root_strategy, use_prob=True)¶ Bases:
attelo.decoding.interface.Decoder
Attach in such a way that the resulting subgraph is a maximum spanning tree of the original
-
decode
(dpack, nonfixed_pairs=None)¶
-
-
class
attelo.decoding.mst.
MstRootStrategy
¶ Bases:
attelo.util.ArgparserEnum
How we declare the MST root node
-
fake_root
= <MstRootStrategy.fake_root: 1>¶
-
leftmost
= <MstRootStrategy.leftmost: 2>¶
-
attelo.decoding.util module¶
Utility classes functions shared by decoders
-
exception
attelo.decoding.util.
DecoderException
¶ Bases:
exceptions.Exception
Exceptions that arise during the decoding process
-
attelo.decoding.util.
cap_score
(score)¶ Cap a real-valued score between MIN_SCORE and MAX_SCORE.
The current default values for MIN_SCORE and MAX_SCORE follow the requirements from the decoders: * The MST decoder uses the depparse package whose MST implementation has a hardcoded minimum score of -1e100 ; Feeding it lower weights crashes the algorithm. Combined scores can’t reach the limit unless we have more than 1e10 nodes. * The Eisner decoder internally uses float64 scores.
Parameters: score (float) – Original score. Returns: bounded_score – Score bounded to [MIN_SCORE, MAX_SCORE]. Return type: float
-
attelo.decoding.util.
convert_prediction
(dpack, triples)¶ Populate a datapack prediction array from a list of triples
Parameters: prediction ([(string, string, string)]) – List of EDU id, EDU id, label triples Returns: dpack – A copy of the original DataPack with predictions set Return type: DataPack
-
attelo.decoding.util.
get_prob_map
(instances)¶ Reformat a probability distribution as a dictionary from edu id pairs to a (relation, probability) tuples
:rtype dict (string, string) (string, float)
-
attelo.decoding.util.
get_sorted_edus
(instances)¶ Return a list of EDUs, using the following as sort key in order of
- starting position (earliest edu first)
- ending position (narrowest edu first)
Note that there may be EDU pairs with the same spans (particularly in case of annotation error). In case of ties, the order should be considered arbitrary
-
attelo.decoding.util.
prediction_to_triples
(dpack)¶ Returns: triples – List of EDU id, EDU id, label triples omitting the unrelated triples
Return type: prediction: [(string, string, string)]
-
attelo.decoding.util.
simple_candidates
(dpack)¶ Translate the links into a list of (EDU, EDU, float, string) quadruplets representing the attachment probability and the the best label for each EDU pair. This is often good enough for simplistic decoders
attelo.decoding.window module¶
A “pruning” decoder that pre-processes candidate edges and prunes them away if they are separated by more than a certain number of EDUs
-
class
attelo.decoding.window.
WindowPruner
(window)¶ Bases:
attelo.decoding.interface.Decoder
Notes
We assume that the datapack includes every EDU in its grouping.
If there are any gaps, the window will be a bit messed up
As decoders are parsers like any other, if you just want to apply this as preprocessing to a decoder, you could construct a mini pipeline consisting of this plus the decoder. Alternatively, if you already have a larger pipeline of which the decoder is already part, you can just insert this before the decoder.
-
decode
(dpack)¶
-