attelo.decoding package¶

Decoding in attelo consists in building discourse graphs from a set of attachment/labelling predictions.

Submodules¶

attelo.decoding.astar module¶

module for building discourse graphs from probability distribution and respecting some constraints, using Astar heuristics based search and variants (beam, b&b)

TODO: unlabelled evaluation seems to bug on RF decoding (relation is of type orange.value -> go see in decoding.py)

class attelo.decoding.astar.AstarArgs¶

Bases: attelo.decoding.astar.AstarArgs

Configuration options for the A* decoder

Parameters:

heuristics (Heuristic) – an a* heuristic funtion (estimate the cost of what has not been explored yet)
use_prob (bool) – indicates if previous scores are probabilities in [0,1] (to be mapped to -log) or arbitrary scores (untouched)
beam (int or None) – size of the beam-search (if None: vanilla astar)
rfc (RfcConstraint) – what sort of right frontier constraint to apply

class attelo.decoding.astar.AstarDecoder(astar_args)¶

Bases: attelo.decoding.interface.Decoder

wrapper for astar decoder to be used by processing pipeline returns the best structure

decode(dpack)¶

class attelo.decoding.astar.DiscData(parent=None, accessible=None, tolink=None)¶

Bases: object

Natural reading order decoding: incremental building of tree in order of text (one edu at a time)

Basic discourse data for a state: chosen links between edus at that stage + right-frontier state. To save space, only new links are stored. the complete solution will be built with backpointers via the parent field

RF: right frontier, = admissible attachment point of current discourse unit

Parameters:	parent – parent state (previous decision) link ((string, string, string)) – current decision (a triplet: target edu, source edu, relation) tolink ([string]) – remaining unattached discourse units

accessible()¶

return the list of edus that are on the right frontier

Return type:	[string]

final()¶: return True if there are no more links to be made

last_link()¶: return the link that was made to get to this state, if any

link(to_edu, from_edu, relation, rfc=<RfcConstraint.full: 2>)¶: rfc = “full”: use the distinction coord/subord rfc = “simple”: consider everything as subord rfc = “none” no constraint on attachment

tobedone()¶

return the list of edus to be linked

Return type:	[string]

class attelo.decoding.astar.DiscourseBeamSearch(heuristic=<function <lambda>>, shared=None, queue_size=10)¶: Bases: attelo.decoding.astar.DiscourseSearch, attelo.optimisation.astar.BeamSearch

class attelo.decoding.astar.DiscourseSearch(heuristic=<function <lambda>>, shared=None, queue_size=None)¶

Bases: attelo.optimisation.astar.Search

subtype of astar search for discourse: should be the same for every astar decoder, provided the discourse state is a subclass of DiscourseState

recover solution should be as is, provided a state has at least the following info: - parent: parent state - _link: the actual prediction made at this stage (1 state = 1 relation = (du1, du2, relation)

new_state(data)¶

recover_solution(endstate)¶: follow back pointers to collect list of chosen relations on edus.

class attelo.decoding.astar.DiscourseState(data, heuristics, shared)¶

Bases: attelo.optimisation.astar.State

Natural reading order decoding: incremental building of tree in order of text (one edu at a time)

instance of discourse graph with probability for each attachement+relation on a subset of edges.

implements the State interface to be used by Search

strategy: at each step of exploration choose a relation between two edus related by probability distribution, reading order a.k.a NRO “natural reading order”, cf Bramsen et al., 2006. in temporal processing.

‘data’ is set of instantiated relations (typically nothing at the beginning, but could be started with a few chosen relations)

‘shared’ points to shared data between states (here proba distribution between considered pairs of edus at least, but also can include precomputed info for heuristics)

h_average()¶: return the average probability possible when n nodes still need to be attached assuming the best overall prob in the distrib

h_best()¶: return the best probability possible when n nodes still need to be attached assuming the best overall prob in the distrib

h_best_overall()¶: return the best probability possible when n nodes still need to be attached assuming the best overall prob in the distrib

h_zero()¶: always 0

is_solution()¶

next_states()¶: must return a state and a cost TODO: adapt to disc parse, according to choice made for data -> especially update to RFC

proba(edu_pair)¶

return the label and probability that an edu pair are attached, or (“no”, None) if we don’t have a prediction for the pair

Return type:	(string, float or None)

shared()¶: information shared between states

strategy()¶: full or not, if the RFC is applied to labelled edu pairs

class attelo.decoding.astar.Heuristic¶

Bases: enum.Enum

Heuristic cost to guide A* search with

zero: see DiscourseState.h_zero
max: see DiscourseState.h_best_overall
best: see DiscourseState.h_best
average: see DiscourseState.h_average

average = <Heuristic.average: 3>¶

best = <Heuristic.best: 2>¶

max = <Heuristic.max: 1>¶

zero = <Heuristic.zero: 0>¶

class attelo.decoding.astar.RfcConstraint¶

Bases: enum.Enum

What sort of right frontier constraint to apply during decoding:

simple: every relation is treated as subordinating

full: (falls back to simple in case of unlabelled prediction)

full = <RfcConstraint.full: 2>¶

none = <RfcConstraint.none: 3>¶

simple = <RfcConstraint.simple: 1>¶

class attelo.decoding.astar.TwoStageNRO¶

Bases: attelo.decoding.astar.DiscourseState

similar as above with different handling of inter-sentence and intra-sentence relations

next_states()¶: must return a state and a cost

same_sentence(edu1, edu2)¶: not implemented: will always return False TODO: this should go in preprocessing before launching astar ?? would it be easier to have access to all edu pair features ?? (certainly for that one)

class attelo.decoding.astar.TwoStageNROData(parent=None, accessible=None, tolink=None)¶

Bases: attelo.decoding.astar.DiscData

similar as above with different handling of inter-sentence and intra-sentence relations

accessible is list of starting edus (only one for now)

accessible()¶: wip:

link(to_edu, from_edu, relation)¶: WIP

update_mode()¶: switch between intra/inter-sentential parsing mode

attelo.decoding.astar.preprocess_heuristics(cands)¶

precompute a set of useful information used by heuristics, such as

best probability
table of best probability when attaching a node, indexed on that node

format of cands is format given in main decoder: a list of (arg1,arg2,proba,best_relation)

attelo.decoding.baseline module¶

Baseline decoders

class attelo.decoding.baseline.LastBaseline¶

Bases: attelo.decoding.interface.Decoder

attach to last, always

decode(dpack, nonfixed_pairs=None)¶

class attelo.decoding.baseline.LocalBaseline(threshold, use_prob=True)¶

Bases: attelo.decoding.interface.Decoder

just attach locally if prob is > threshold

decode(dpack, nonfixed_pairs=None)¶

attelo.decoding.greedy module¶

Implementation of the locally greedy approach similar with DuVerle & Predinger (2009, 2010) (but adapted for SDRT, where the notion of adjacency includes embedded segments)

July 2012

@author: stergos

class attelo.decoding.greedy.LocallyGreedy¶

Bases: attelo.decoding.interface.Decoder

The locally greedy decoder

decode(dpack)¶

class attelo.decoding.greedy.LocallyGreedyState(instances)¶

Bases: object

the mutable parts of the locally greedy algorithm

decode()¶

Run the decoder

:rtype [(EDU, EDU, string)]

attelo.decoding.greedy.are_strictly_adjacent(one, two, edus)¶

returns True in the following cases

[one] [two]
[two] [one]

in the rest of the cases (when there is an edu between one and two) it returns False

attelo.decoding.greedy.get_neighbours(edus)¶

Return a mapping from each EDU to its neighbours

Return type:	Dict Edu [Edu]

attelo.decoding.greedy.is_embedded(one, two)¶

returns True when one is embedded in two, that is

[two ... [one] ... ]

returns False in all other cases

attelo.decoding.interface module¶

Common interface that all decoders must implement

class attelo.decoding.interface.Decoder¶

Bases: attelo.parser.interface.Parser

A decoder is a function which given a probability distribution (see below) and some control parameters, returns a sequence of predictions.

Most decoders only really return one prediction in practice, but some, like the A* decoder might have able to return a ranked sequence of the “N best” predictions it can find

We have a few informal types to consider here:

a link ((string, string, string)) represents a link between a pair of EDUs. The first two items are their identifiers, and the third is the link label

a candidate link (or candidate, to be short, (EDU, EDU, float, string)) is a link with a probability attached

a prediction is morally a set (in practice a list) of links

a distribution is morally a set of proposed links

Note that a decoder could also be seen/used as a sort of crude parser (with a fit function is a no-op). You’ll likely want to prefix it with a parser that extracts weights from datapacks lest you work with the somewhat unformative 1.0s everywhere.

decode(dpack)¶: Return the N-best predictions in the form of a datapack per prediction.

fit(dpacks, targets, nonfixed_pairs=None, cache=None)¶

transform(dpack, nonfixed_pairs=None)¶

attelo.decoding.local module¶

Local decoders make decisions for each edge independently.

class attelo.decoding.local.AsManyDecoder¶

Bases: attelo.decoding.interface.Decoder

Greedy decoder that picks as many edges as there are real EDUs.

The output structure is a graph that has the same number of edges as a spanning tree over the EDUs. It can be non-connex, contain cycles and re-entrancies.

decode(dpack)¶: Return the set of top N edges

class attelo.decoding.local.BestIncomingDecoder¶

Bases: attelo.decoding.interface.Decoder

Greedy decoder that picks the best incoming edge for each EDU.

The output structure is a graph that contains exactly one incoming edge for each EDU, thus it has the same number of edges as a spanning tree over the EDUs. It can be non-connex or contain cycles, but no re-entrancy.

decode(dpack)¶: Return the best incoming edge for each EDU

attelo.decoding.mst module¶

Created on Jun 27, 2012

@author: stergos, jrmyp

class attelo.decoding.mst.MsdagDecoder(root_strategy, use_prob=True)¶

Bases: attelo.decoding.mst.MstDecoder

Attach according to MSDAG (subgraph of original)

decode(dpack, nonfixed_pairs=None)¶

class attelo.decoding.mst.MstDecoder(root_strategy, use_prob=True)¶

Bases: attelo.decoding.interface.Decoder

Attach in such a way that the resulting subgraph is a maximum spanning tree of the original

decode(dpack, nonfixed_pairs=None)¶

class attelo.decoding.mst.MstRootStrategy¶

Bases: attelo.util.ArgparserEnum

How we declare the MST root node

fake_root = <MstRootStrategy.fake_root: 1>¶

leftmost = <MstRootStrategy.leftmost: 2>¶

attelo.decoding.util module¶

Utility classes functions shared by decoders

exception attelo.decoding.util.DecoderException¶

Bases: exceptions.Exception

Exceptions that arise during the decoding process

attelo.decoding.util.cap_score(score)¶

Cap a real-valued score between MIN_SCORE and MAX_SCORE.

The current default values for MIN_SCORE and MAX_SCORE follow the requirements from the decoders: * The MST decoder uses the depparse package whose MST implementation has a hardcoded minimum score of -1e100 ; Feeding it lower weights crashes the algorithm. Combined scores can’t reach the limit unless we have more than 1e10 nodes. * The Eisner decoder internally uses float64 scores.

Parameters:	score (float) – Original score.
Returns:	bounded_score – Score bounded to [MIN_SCORE, MAX_SCORE].
Return type:	float

attelo.decoding.util.convert_prediction(dpack, triples)¶

Populate a datapack prediction array from a list of triples

Parameters:	prediction ([(string, string, string)]) – List of EDU id, EDU id, label triples
Returns:	dpack – A copy of the original DataPack with predictions set
Return type:	DataPack

attelo.decoding.util.get_prob_map(instances)¶

Reformat a probability distribution as a dictionary from edu id pairs to a (relation, probability) tuples

:rtype dict (string, string) (string, float)

attelo.decoding.util.get_sorted_edus(instances)¶

Return a list of EDUs, using the following as sort key in order of

starting position (earliest edu first)
ending position (narrowest edu first)

Note that there may be EDU pairs with the same spans (particularly in case of annotation error). In case of ties, the order should be considered arbitrary

attelo.decoding.util.prediction_to_triples(dpack)¶

Returns:	triples – List of EDU id, EDU id, label triples omitting the unrelated triples
Return type:	prediction: [(string, string, string)]

attelo.decoding.util.simple_candidates(dpack)¶: Translate the links into a list of (EDU, EDU, float, string) quadruplets representing the attachment probability and the the best label for each EDU pair. This is often good enough for simplistic decoders

attelo.decoding.window module¶

A “pruning” decoder that pre-processes candidate edges and prunes them away if they are separated by more than a certain number of EDUs

class attelo.decoding.window.WindowPruner(window)¶

Bases: attelo.decoding.interface.Decoder

Notes

We assume that the datapack includes every EDU in its grouping.

If there are any gaps, the window will be a bit messed up

As decoders are parsers like any other, if you just want to apply this as preprocessing to a decoder, you could construct a mini pipeline consisting of this plus the decoder. Alternatively, if you already have a larger pipeline of which the decoder is already part, you can just insert this before the decoder.

decode(dpack)¶