All fields: ocr - OPAC collections catalog

Title:: Multilevel correction of OCR of medical texts
Authors:: Piasecki, M.
Links:: https://bibliotekanauki.pl/articles/333886.pdf
Publication date:: 2007
Publisher:: Uniwersytet Śląski. Wydział Informatyki i Nauki o Materiałach. Instytut Informatyki. Zakład Systemów Komputerowych
Subjects:: handwriting OCR
medical documents
language model
tagger parser
Polish
Description:: The paper investigates the idea of multilevel correction of the results of handwriting OCR of medical texts. The correction is performed according to different levels of linguistic knowledge. Three types of models are presented and their results compared: n-gram Language Models of word form and base form sequences, a morpho-syntactic model based on a tagger, and a model of correction by parsing. The parsing model combines a deterministic Czech parser adapted for Polish with the Structured Language Model based on lexicalised, binary parsing trees produced in a left-to-right manner. Contrary to initial expectations, the best correction result, raising accuracy from 82% for the word-level classifier to 92.98% overall, was achieved with n-gram Language Models. The richer the description of language expressions in a model, the worse the results obtained. This result is to a large extent caused by the specific characteristics of the processed medical documents.
Source:: Journal of Medical Informatics & Technologies; 2007, 11; 263-273
1642-6037
Appears in:: Journal of Medical Informatics & Technologies
Content provider:: Biblioteka Nauki

Article

Title:: Overconsolidation and microstructures in Neogene clays from the Warsaw area
Authors:: Kaczyński, R.
Links:: https://bibliotekanauki.pl/articles/2059628.pdf
Publication date:: 2003
Publisher:: Państwowy Instytut Geologiczny – Państwowy Instytut Badawczy
Subjects:: Neogene clays
microstructural parameters
porous space
OCR
Description:: The main objective of the study was to determine the loading history and establish the current state of consolidation of Neogene clays, to study their lithological and microstructural properties, and to define their geological-engineering properties. To accomplish this task, a series of laboratory and field tests was performed. The tests were made on clays taken from pits excavated for underground stations and tunnels (A-14-A-15) in Warsaw and from 2 borehole cores taken from the Stegny experimental field. The tests showed that: the clays are historically overconsolidated, with an overconsolidation ratio (OCR) of 25-50, and their current state of preconsolidation is OCR = 2-14; their clay microstructures, observed for the first time, are matrix-turbulent and turbulent-laminar; and there is a clear anisotropy in the quantitative parameters of the pore spaces, with these parameters varying with depth. The engineering-geological characteristics (physical and mechanical properties) of the clays were assessed. The results of the study can be used directly to evaluate the Neogene clays of the Warsaw area for their suitability as a subsoil for engineering projects, and indirectly to do the same for other overconsolidated soils, particularly with regard to the study methodologies applied and described.
Source:: Geological Quarterly; 2003, 47, 1; 43-54
1641-7291
Appears in:: Geological Quarterly
Content provider:: Biblioteka Nauki

Article

Title:: The IMPACT project Polish Ground-Truth texts as a DjVu corpus
Authors:: Bień, Janusz S.
Links:: https://bibliotekanauki.pl/articles/677177.pdf
Publication date:: 2014
Publisher:: Polska Akademia Nauk. Instytut Slawistyki PAN
Subjects:: Polish language
corpora
DjVu
OCR
PAGE
Page Analysis and Ground-Truth Elements
GNU GPL
Description:: The purpose of the paper is twofold. First, to describe the already implemented idea of DjVu corpora, i.e. corpora which consist of both scanned images and a transcription of the texts, with the words associated with their occurrences in the scans. Second, to present a case study of a corpus consisting of almost 5,000 pages of Polish historical texts dating from 1570 to 1756 (it is practically the very first corpus of historical Polish). The tools described are universal in character and freely available under the GNU GPL license, so they can also be used for other purposes.
Source:: Cognitive Studies; 2014, 14
2392-2397
Appears in:: Cognitive Studies
Content provider:: Biblioteka Nauki

Article

Title:: Preprocessing Photos of Receipts for Recognition
Authors:: Korobacz, W.
Tabędzki, M.
Links:: https://bibliotekanauki.pl/articles/88364.pdf
Publication date:: 2018
Publisher:: Politechnika Białostocka. Oficyna Wydawnicza Politechniki Białostockiej
Subjects:: digital image processing
character recognition
OCR
optical character recognition
Description:: This work concerns image pre-processing methods applied to photos of receipts. The purpose is to improve their quality and thereby increase the effectiveness of conventional text recognition (OCR) software. The authors mainly had difficult cases in mind: photos taken freehand in unfavorable lighting conditions. The work describes the analyzed methods of filtering, binarization, edge detection, image straightening, marking the area of interest, and thinning. Preliminary results of tests with OCR software on a small data set are also presented. Thanks to pre-processing, character recognition efficiency improved by 25%. The final part presents conclusions and plans for future work.
Source:: Advances in Computer Science Research; 2018, 14; 87-103
2300-715X
Appears in:: Advances in Computer Science Research
Content provider:: Biblioteka Nauki

Article

Title:: An Approach to License Plate Recognition in Real Time Using Multi-stage Computational Intelligence Classifier
Authors:: Kekez, Michał
Links:: https://bibliotekanauki.pl/articles/27311914.pdf
Publication date:: 2023
Publisher:: Polska Akademia Nauk. Czasopisma i Monografie PAN
Subjects:: car license plates
LPR
ANPR
OCR
image processing
neural network
Random Forest
Description:: Automatic car license plate recognition (LPR) is widely used nowadays. It involves plate localization in the image, character segmentation and optical character recognition. This paper proposes a set of descriptors of image segments (characters), as well as a technique for multi-stage classification of letters and digits using a cascade of a neural network and several parallel Random Forest, classification tree, or rule list classifiers. The proposed solution was applied to automated recognition of number plates composed of capital Latin letters and Arabic numerals. The paper presents an analysis of the accuracy of the obtained classifiers. The time needed to build the classifier and the time needed to classify characters using it are also presented.
Source:: International Journal of Electronics and Telecommunications; 2023, 69, 2; 275-280
2300-1933
Appears in:: International Journal of Electronics and Telecommunications
Content provider:: Biblioteka Nauki

Article

Title:: Investigation of Normalization Techniques and Their Impact on a Recognition Rate in Handwritten Numeral Recognition
Authors:: Chmielnicki, Wieslaw
Stapor, Katarzyna
Links:: https://bibliotekanauki.pl/articles/1373426.pdf
Publication date:: 2010
Publisher:: Uniwersytet Jagielloński. Wydawnictwo Uniwersytetu Jagiellońskiego
Subjects:: handwritten numeral recognition
normalization techniques
SVM classifier
feature vectors
OCR
geometric invariants
Zernike moments
gradient features
Description:: This paper presents several normalization techniques used in handwritten numeral recognition and their impact on recognition rates. Experiments with five different feature vectors based on geometric invariants, Zernike moments and gradient features are conducted. The recognition rates obtained using a combination of these methods with gradient features and the SVM-rbf classifier are comparable to the best state-of-the-art techniques.
Source:: Schedae Informaticae; 2010, 19; 53-77
0860-0295
2083-8476
Appears in:: Schedae Informaticae
Content provider:: Biblioteka Nauki

Article

Title:: Development of Extensive Polish Handwritten Characters Database for Text Recognition Research
Authors:: Tokovarov, Mikhail
Kaczorowska, Monika
Miłosz, Marek
Links:: https://bibliotekanauki.pl/articles/102832.pdf
Publication date:: 2020
Publisher:: Stowarzyszenie Inżynierów i Techników Mechaników Polskich
Subjects:: OCR
Handwriting character samples
Database for optical character recognition
Polish handwritten characters database
Description:: In the modern world, fast and efficient processing of non-digital (handwritten or typed) texts is a task of extreme importance. Like many other fields, optical character recognition (OCR) benefits from the application of machine learning (ML), which allows effective and accurate methods to be developed. To achieve good performance, a machine learning algorithm requires a great amount of data. A large database of handwritten characters prepared by the National Institute of Standards and Technology (NIST), USA, can be used for training an ML model. However, significant differences exist between handwriting styles in the US and Poland. That fact, along with the absence of Polish diacritical marks, makes the NIST database less useful for developing an OCR model for the Polish language. To the best of the authors' knowledge, no database with samples of Polish handwriting exists. The present research is focused on filling this gap, i.e. gathering and preparing an extensive database of Polish handwritten characters. The paper presents the very first database of Polish handwriting samples. The database is far larger than all the datasets used in previous attempts at implementing OCR for Polish handwriting. It is also the first fully publicly accessible database of Polish handwriting of this scale. The same method and developed tools can be used to build handwritten character databases for other languages.
Source:: Advances in Science and Technology. Research Journal; 2020, 14, 3; 30-38
2299-8624
Appears in:: Advances in Science and Technology. Research Journal
Content provider:: Biblioteka Nauki

Article
