--- _id: '10737' abstract: - lang: eng text: We consider two models for the sequence labeling (tagging) problem. The first one is a Pattern-Based Conditional Random Field (PB), in which the energy of a string (chain labeling) x=x1⁢…⁢xn∈Dn is a sum of terms over intervals [i,j] where each term is non-zero only if the substring xi⁢…⁢xj equals a prespecified word w∈Λ. The second model is a Weighted Context-Free Grammar (WCFG) frequently used for natural language processing. PB and WCFG encode local and non-local interactions respectively, and thus can be viewed as complementary. We propose a Grammatical Pattern-Based CRF model (GPB) that combines the two in a natural way. We argue that it has certain advantages over existing approaches such as the Hybrid model of Benedí and Sanchez that combines N-grams and WCFGs. The focus of this paper is to analyze the complexity of inference tasks in a GPB such as computing MAP. We present a polynomial-time algorithm for general GPBs and a faster version for a special case that we call Interaction Grammars. article_processing_charge: No article_type: original author: - first_name: Rustem full_name: Takhanov, Rustem id: 2CCAC26C-F248-11E8-B48F-1D18A9856A87 last_name: Takhanov - first_name: Vladimir full_name: Kolmogorov, Vladimir id: 3D50B0BA-F248-11E8-B48F-1D18A9856A87 last_name: Kolmogorov citation: ama: Takhanov R, Kolmogorov V. Combining pattern-based CRFs and weighted context-free grammars. Intelligent Data Analysis. 2022;26(1):257-272. doi:10.3233/IDA-205623 apa: Takhanov, R., & Kolmogorov, V. (2022). Combining pattern-based CRFs and weighted context-free grammars. Intelligent Data Analysis. IOS Press. https://doi.org/10.3233/IDA-205623 chicago: Takhanov, Rustem, and Vladimir Kolmogorov. “Combining Pattern-Based CRFs and Weighted Context-Free Grammars.” Intelligent Data Analysis. IOS Press, 2022. https://doi.org/10.3233/IDA-205623. ieee: R. Takhanov and V. Kolmogorov, “Combining pattern-based CRFs and weighted context-free grammars,” Intelligent Data Analysis, vol. 26, no. 1. IOS Press, pp. 257–272, 2022. ista: Takhanov R, Kolmogorov V. 2022. Combining pattern-based CRFs and weighted context-free grammars. Intelligent Data Analysis. 26(1), 257–272. mla: Takhanov, Rustem, and Vladimir Kolmogorov. “Combining Pattern-Based CRFs and Weighted Context-Free Grammars.” Intelligent Data Analysis, vol. 26, no. 1, IOS Press, 2022, pp. 257–72, doi:10.3233/IDA-205623. short: R. Takhanov, V. Kolmogorov, Intelligent Data Analysis 26 (2022) 257–272. date_created: 2022-02-06T23:01:32Z date_published: 2022-01-14T00:00:00Z date_updated: 2023-08-02T14:09:41Z day: '14' department: - _id: VlKo doi: 10.3233/IDA-205623 external_id: arxiv: - '1404.5475' isi: - '000749997700015' intvolume: ' 26' isi: 1 issue: '1' language: - iso: eng main_file_link: - open_access: '1' url: https://arxiv.org/abs/1404.5475 month: '01' oa: 1 oa_version: Preprint page: 257-272 publication: Intelligent Data Analysis publication_identifier: eissn: - 1571-4128 issn: - 1088-467X publication_status: published publisher: IOS Press quality_controlled: '1' scopus_import: '1' status: public title: Combining pattern-based CRFs and weighted context-free grammars type: journal_article user_id: 4359f0d1-fa6c-11eb-b949-802e58b17ae8 volume: 26 year: '2022' ... --- _id: '1794' abstract: - lang: eng text: We consider Conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) (Formula presented.) is the sum of terms over intervals [i, j] where each term is non-zero only if the substring (Formula presented.) equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) computing the MAP. Their complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.) where L is the combined length of input patterns, (Formula presented.) is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009) whose complexities are respectively (Formula presented.), (Formula presented.) and (Formula presented.), where (Formula presented.) is the number of input patterns. In addition, we give an efficient algorithm for sampling, and revisit the case of MAP with non-positive weights. acknowledgement: This work has been partially supported by the European Research Council under the European Unions Seventh Framework Programme (FP7/2007-2013)/ERC grant agreement no. 616160. author: - first_name: Vladimir full_name: Kolmogorov, Vladimir id: 3D50B0BA-F248-11E8-B48F-1D18A9856A87 last_name: Kolmogorov - first_name: Rustem full_name: Takhanov, Rustem id: 2CCAC26C-F248-11E8-B48F-1D18A9856A87 last_name: Takhanov citation: ama: Kolmogorov V, Takhanov R. Inference algorithms for pattern-based CRFs on sequence data. Algorithmica. 2016;76(1):17-46. doi:10.1007/s00453-015-0017-7 apa: Kolmogorov, V., & Takhanov, R. (2016). Inference algorithms for pattern-based CRFs on sequence data. Algorithmica. Springer. https://doi.org/10.1007/s00453-015-0017-7 chicago: Kolmogorov, Vladimir, and Rustem Takhanov. “Inference Algorithms for Pattern-Based CRFs on Sequence Data.” Algorithmica. Springer, 2016. https://doi.org/10.1007/s00453-015-0017-7. ieee: V. Kolmogorov and R. Takhanov, “Inference algorithms for pattern-based CRFs on sequence data,” Algorithmica, vol. 76, no. 1. Springer, pp. 17–46, 2016. ista: Kolmogorov V, Takhanov R. 2016. Inference algorithms for pattern-based CRFs on sequence data. Algorithmica. 76(1), 17–46. mla: Kolmogorov, Vladimir, and Rustem Takhanov. “Inference Algorithms for Pattern-Based CRFs on Sequence Data.” Algorithmica, vol. 76, no. 1, Springer, 2016, pp. 17–46, doi:10.1007/s00453-015-0017-7. short: V. Kolmogorov, R. Takhanov, Algorithmica 76 (2016) 17–46. date_created: 2018-12-11T11:54:02Z date_published: 2016-09-01T00:00:00Z date_updated: 2023-10-17T09:51:31Z day: '01' department: - _id: VlKo doi: 10.1007/s00453-015-0017-7 ec_funded: 1 external_id: arxiv: - '1210.0508' intvolume: ' 76' issue: '1' language: - iso: eng main_file_link: - open_access: '1' url: http://arxiv.org/abs/1210.0508 month: '09' oa: 1 oa_version: Preprint page: 17 - 46 project: - _id: 25FBA906-B435-11E9-9278-68D0E5697425 call_identifier: FP7 grant_number: '616160' name: 'Discrete Optimization in Computer Vision: Theory and Practice' publication: Algorithmica publication_status: published publisher: Springer publist_id: '5316' quality_controlled: '1' related_material: record: - id: '2272' relation: earlier_version status: public scopus_import: 1 status: public title: Inference algorithms for pattern-based CRFs on sequence data type: journal_article user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 76 year: '2016' ... --- _id: '2272' abstract: - lang: eng text: "We consider Conditional Random Fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) x1...xn is the sum of terms over intervals [i,j] where each term is non-zero only if the substring xi...xj equals a prespecified pattern α. Such CRFs can be naturally applied to many sequence tagging problems.\r\nWe present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) computing the MAP. Their complexities are respectively O(nL), O(nLℓmax) and O(nLmin{|D|,log(ℓmax+1)}) where L is the combined length of input patterns, ℓmax is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of (Ye et al., 2009) whose complexities are respectively O(nL|D|), O(n|Γ|L2ℓ2max) and O(nL|D|), where |Γ| is the number of input patterns.\r\nIn addition, we give an efficient algorithm for sampling. Finally, we consider the case of non-positive weights. (Komodakis & Paragios, 2009) gave an O(nL) algorithm for computing the MAP. We present a modification that has the same worst-case complexity but can beat it in the best case. " alternative_title: - JMLR article_processing_charge: No author: - first_name: Rustem full_name: Takhanov, Rustem id: 2CCAC26C-F248-11E8-B48F-1D18A9856A87 last_name: Takhanov - first_name: Vladimir full_name: Kolmogorov, Vladimir id: 3D50B0BA-F248-11E8-B48F-1D18A9856A87 last_name: Kolmogorov citation: ama: 'Takhanov R, Kolmogorov V. Inference algorithms for pattern-based CRFs on sequence data. In: ICML’13 Proceedings of the 30th International Conference on International. Vol 28. ML Research Press; 2013:145-153.' apa: 'Takhanov, R., & Kolmogorov, V. (2013). Inference algorithms for pattern-based CRFs on sequence data. In ICML’13 Proceedings of the 30th International Conference on International (Vol. 28, pp. 145–153). Atlanta, GA, USA: ML Research Press.' chicago: Takhanov, Rustem, and Vladimir Kolmogorov. “Inference Algorithms for Pattern-Based CRFs on Sequence Data.” In ICML’13 Proceedings of the 30th International Conference on International, 28:145–53. ML Research Press, 2013. ieee: R. Takhanov and V. Kolmogorov, “Inference algorithms for pattern-based CRFs on sequence data,” in ICML’13 Proceedings of the 30th International Conference on International, Atlanta, GA, USA, 2013, vol. 28, no. 3, pp. 145–153. ista: 'Takhanov R, Kolmogorov V. 2013. Inference algorithms for pattern-based CRFs on sequence data. ICML’13 Proceedings of the 30th International Conference on International. ICML: International Conference on Machine Learning, JMLR, vol. 28, 145–153.' mla: Takhanov, Rustem, and Vladimir Kolmogorov. “Inference Algorithms for Pattern-Based CRFs on Sequence Data.” ICML’13 Proceedings of the 30th International Conference on International, vol. 28, no. 3, ML Research Press, 2013, pp. 145–53. short: R. Takhanov, V. Kolmogorov, in:, ICML’13 Proceedings of the 30th International Conference on International, ML Research Press, 2013, pp. 145–153. conference: end_date: 2013-06-21 location: Atlanta, GA, USA name: 'ICML: International Conference on Machine Learning' start_date: 2013-06-16 date_created: 2018-12-11T11:56:41Z date_published: 2013-06-01T00:00:00Z date_updated: 2023-10-17T09:51:32Z day: '01' department: - _id: VlKo intvolume: ' 28' issue: '3' language: - iso: eng main_file_link: - open_access: '1' url: http://proceedings.mlr.press/v28/takhanov13.pdf?CFID=105472548&CFTOKEN=5c5859b5d97b4439-27B4AC58-BA92-A964-B598CAACEE6CC515 month: '06' oa: 1 oa_version: Submitted Version page: 145 - 153 publication: ICML'13 Proceedings of the 30th International Conference on International publication_status: published publisher: ML Research Press publist_id: '4672' quality_controlled: '1' related_material: record: - id: '1794' relation: later_version status: public scopus_import: '1' status: public title: Inference algorithms for pattern-based CRFs on sequence data type: conference user_id: 2DF688A6-F248-11E8-B48F-1D18A9856A87 volume: 28 year: '2013' ...