---
_id: '10737'
abstract:
- lang: eng
text: We consider two models for the sequence labeling (tagging) problem. The first
one is a Pattern-Based Conditional Random Field (PB), in which the energy of a
string (chain labeling) x=x1…xn∈Dn is a sum of terms over intervals [i,j] where
each term is non-zero only if the substring xi…xj equals a prespecified word
w∈Λ. The second model is a Weighted Context-Free Grammar (WCFG) frequently used
for natural language processing. PB and WCFG encode local and non-local interactions
respectively, and thus can be viewed as complementary. We propose a Grammatical
Pattern-Based CRF model (GPB) that combines the two in a natural way. We argue
that it has certain advantages over existing approaches such as the Hybrid model
of Benedí and Sanchez that combines N-grams and WCFGs. The focus of this paper
is to analyze the complexity of inference tasks in a GPB such as computing MAP.
We present a polynomial-time algorithm for general GPBs and a faster version for
a special case that we call Interaction Grammars.
article_processing_charge: No
article_type: original
author:
- first_name: Rustem
full_name: Takhanov, Rustem
id: 2CCAC26C-F248-11E8-B48F-1D18A9856A87
last_name: Takhanov
- first_name: Vladimir
full_name: Kolmogorov, Vladimir
id: 3D50B0BA-F248-11E8-B48F-1D18A9856A87
last_name: Kolmogorov
citation:
ama: Takhanov R, Kolmogorov V. Combining pattern-based CRFs and weighted context-free
grammars. Intelligent Data Analysis. 2022;26(1):257-272. doi:10.3233/IDA-205623
apa: Takhanov, R., & Kolmogorov, V. (2022). Combining pattern-based CRFs and
weighted context-free grammars. Intelligent Data Analysis. IOS Press. https://doi.org/10.3233/IDA-205623
chicago: Takhanov, Rustem, and Vladimir Kolmogorov. “Combining Pattern-Based CRFs
and Weighted Context-Free Grammars.” Intelligent Data Analysis. IOS Press,
2022. https://doi.org/10.3233/IDA-205623.
ieee: R. Takhanov and V. Kolmogorov, “Combining pattern-based CRFs and weighted
context-free grammars,” Intelligent Data Analysis, vol. 26, no. 1. IOS
Press, pp. 257–272, 2022.
ista: Takhanov R, Kolmogorov V. 2022. Combining pattern-based CRFs and weighted
context-free grammars. Intelligent Data Analysis. 26(1), 257–272.
mla: Takhanov, Rustem, and Vladimir Kolmogorov. “Combining Pattern-Based CRFs and
Weighted Context-Free Grammars.” Intelligent Data Analysis, vol. 26, no.
1, IOS Press, 2022, pp. 257–72, doi:10.3233/IDA-205623.
short: R. Takhanov, V. Kolmogorov, Intelligent Data Analysis 26 (2022) 257–272.
date_created: 2022-02-06T23:01:32Z
date_published: 2022-01-14T00:00:00Z
date_updated: 2023-08-02T14:09:41Z
day: '14'
department:
- _id: VlKo
doi: 10.3233/IDA-205623
external_id:
arxiv:
- '1404.5475'
isi:
- '000749997700015'
intvolume: ' 26'
isi: 1
issue: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: https://arxiv.org/abs/1404.5475
month: '01'
oa: 1
oa_version: Preprint
page: 257-272
publication: Intelligent Data Analysis
publication_identifier:
eissn:
- 1571-4128
issn:
- 1088-467X
publication_status: published
publisher: IOS Press
quality_controlled: '1'
scopus_import: '1'
status: public
title: Combining pattern-based CRFs and weighted context-free grammars
type: journal_article
user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8
volume: 26
year: '2022'
...
---
_id: '1794'
abstract:
- lang: eng
text: We consider Conditional Random Fields (CRFs) with pattern-based potentials
defined on a chain. In this model the energy of a string (labeling) x_1…x_n
is the sum of terms over intervals [i, j] where each term is non-zero only if
the substring x_i…x_j equals a prespecified pattern w. Such CRFs
can be naturally applied to many sequence tagging problems. We present efficient
algorithms for the three standard inference tasks in a CRF, namely computing (i)
the partition function, (ii) marginals, and (iii) the MAP. Their complexities
are respectively O(nL), O(nLℓ_max) and O(nL min{|D|, log(ℓ_max+1)}),
where L is the combined length of input patterns, ℓ_max is the
maximum length of a pattern, and D is the input alphabet. This improves on the
previous algorithms of Ye et al. (NIPS, 2009) whose complexities are respectively
O(nL|D|), O(n|Γ|L^2 ℓ_max^2) and O(nL|D|), where |Γ|
is the number of input patterns. In addition, we give an efficient
algorithm for sampling, and revisit the case of MAP with non-positive weights.
acknowledgement: This work has been partially supported by the European Research Council
under the European Union's Seventh Framework Programme (FP7/2007-2013)/ERC grant
agreement no. 616160.
author:
- first_name: Vladimir
full_name: Kolmogorov, Vladimir
id: 3D50B0BA-F248-11E8-B48F-1D18A9856A87
last_name: Kolmogorov
- first_name: Rustem
full_name: Takhanov, Rustem
id: 2CCAC26C-F248-11E8-B48F-1D18A9856A87
last_name: Takhanov
citation:
ama: Kolmogorov V, Takhanov R. Inference algorithms for pattern-based CRFs on sequence
data. Algorithmica. 2016;76(1):17-46. doi:10.1007/s00453-015-0017-7
apa: Kolmogorov, V., & Takhanov, R. (2016). Inference algorithms for pattern-based
CRFs on sequence data. Algorithmica. Springer. https://doi.org/10.1007/s00453-015-0017-7
chicago: Kolmogorov, Vladimir, and Rustem Takhanov. “Inference Algorithms for Pattern-Based
CRFs on Sequence Data.” Algorithmica. Springer, 2016. https://doi.org/10.1007/s00453-015-0017-7.
ieee: V. Kolmogorov and R. Takhanov, “Inference algorithms for pattern-based CRFs
on sequence data,” Algorithmica, vol. 76, no. 1. Springer, pp. 17–46, 2016.
ista: Kolmogorov V, Takhanov R. 2016. Inference algorithms for pattern-based CRFs
on sequence data. Algorithmica. 76(1), 17–46.
mla: Kolmogorov, Vladimir, and Rustem Takhanov. “Inference Algorithms for Pattern-Based
CRFs on Sequence Data.” Algorithmica, vol. 76, no. 1, Springer, 2016, pp.
17–46, doi:10.1007/s00453-015-0017-7.
short: V. Kolmogorov, R. Takhanov, Algorithmica 76 (2016) 17–46.
date_created: 2018-12-11T11:54:02Z
date_published: 2016-09-01T00:00:00Z
date_updated: 2023-10-17T09:51:31Z
day: '01'
department:
- _id: VlKo
doi: 10.1007/s00453-015-0017-7
ec_funded: 1
external_id:
arxiv:
- '1210.0508'
intvolume: ' 76'
issue: '1'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: http://arxiv.org/abs/1210.0508
month: '09'
oa: 1
oa_version: Preprint
page: 17 - 46
project:
- _id: 25FBA906-B435-11E9-9278-68D0E5697425
call_identifier: FP7
grant_number: '616160'
name: 'Discrete Optimization in Computer Vision: Theory and Practice'
publication: Algorithmica
publication_status: published
publisher: Springer
publist_id: '5316'
quality_controlled: '1'
related_material:
record:
- id: '2272'
relation: earlier_version
status: public
scopus_import: 1
status: public
title: Inference algorithms for pattern-based CRFs on sequence data
type: journal_article
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 76
year: '2016'
...
---
_id: '2272'
abstract:
- lang: eng
text: "We consider Conditional Random Fields (CRFs) with pattern-based potentials
defined on a chain. In this model the energy of a string (labeling) x_1...x_n is
the sum of terms over intervals [i,j] where each term is non-zero only if the
substring x_i...x_j equals a prespecified pattern α. Such CRFs can be naturally
applied to many sequence tagging problems.\r\nWe present efficient algorithms
for the three standard inference tasks in a CRF, namely computing (i) the partition
function, (ii) marginals, and (iii) the MAP. Their complexities are
respectively O(nL), O(nLℓ_max) and O(nL min{|D|, log(ℓ_max+1)}) where L is the combined
length of input patterns, ℓ_max is the maximum length of a pattern, and D is the
input alphabet. This improves on the previous algorithms of Ye et al. (2009)
whose complexities are respectively O(nL|D|), O(n|Γ|L^2 ℓ_max^2) and O(nL|D|), where
|Γ| is the number of input patterns.\r\nIn addition, we give an efficient algorithm
for sampling. Finally, we consider the case of non-positive weights. Komodakis
& Paragios (2009) gave an O(nL) algorithm for computing the MAP. We present
a modification that has the same worst-case complexity but can beat it in the
best case. "
alternative_title:
- JMLR
article_processing_charge: No
author:
- first_name: Rustem
full_name: Takhanov, Rustem
id: 2CCAC26C-F248-11E8-B48F-1D18A9856A87
last_name: Takhanov
- first_name: Vladimir
full_name: Kolmogorov, Vladimir
id: 3D50B0BA-F248-11E8-B48F-1D18A9856A87
last_name: Kolmogorov
citation:
ama: 'Takhanov R, Kolmogorov V. Inference algorithms for pattern-based CRFs on sequence
data. In: ICML’13 Proceedings of the 30th International Conference on International.
Vol 28. ML Research Press; 2013:145-153.'
apa: 'Takhanov, R., & Kolmogorov, V. (2013). Inference algorithms for pattern-based
CRFs on sequence data. In ICML’13 Proceedings of the 30th International Conference
on International (Vol. 28, pp. 145–153). Atlanta, GA, USA: ML Research Press.'
chicago: Takhanov, Rustem, and Vladimir Kolmogorov. “Inference Algorithms for Pattern-Based
CRFs on Sequence Data.” In ICML’13 Proceedings of the 30th International Conference
on International, 28:145–53. ML Research Press, 2013.
ieee: R. Takhanov and V. Kolmogorov, “Inference algorithms for pattern-based CRFs
on sequence data,” in ICML’13 Proceedings of the 30th International Conference
on International, Atlanta, GA, USA, 2013, vol. 28, no. 3, pp. 145–153.
ista: 'Takhanov R, Kolmogorov V. 2013. Inference algorithms for pattern-based CRFs
on sequence data. ICML’13 Proceedings of the 30th International Conference on
International. ICML: International Conference on Machine Learning, JMLR, vol.
28, 145–153.'
mla: Takhanov, Rustem, and Vladimir Kolmogorov. “Inference Algorithms for Pattern-Based
CRFs on Sequence Data.” ICML’13 Proceedings of the 30th International Conference
on International, vol. 28, no. 3, ML Research Press, 2013, pp. 145–53.
short: R. Takhanov, V. Kolmogorov, in:, ICML’13 Proceedings of the 30th International
Conference on International, ML Research Press, 2013, pp. 145–153.
conference:
end_date: 2013-06-21
location: Atlanta, GA, USA
name: 'ICML: International Conference on Machine Learning'
start_date: 2013-06-16
date_created: 2018-12-11T11:56:41Z
date_published: 2013-06-01T00:00:00Z
date_updated: 2023-10-17T09:51:32Z
day: '01'
department:
- _id: VlKo
intvolume: ' 28'
issue: '3'
language:
- iso: eng
main_file_link:
- open_access: '1'
url: http://proceedings.mlr.press/v28/takhanov13.pdf?CFID=105472548&CFTOKEN=5c5859b5d97b4439-27B4AC58-BA92-A964-B598CAACEE6CC515
month: '06'
oa: 1
oa_version: Submitted Version
page: 145 - 153
publication: ICML'13 Proceedings of the 30th International Conference on International
publication_status: published
publisher: ML Research Press
publist_id: '4672'
quality_controlled: '1'
related_material:
record:
- id: '1794'
relation: later_version
status: public
scopus_import: '1'
status: public
title: Inference algorithms for pattern-based CRFs on sequence data
type: conference
user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87
volume: 28
year: '2013'
...