util / ExtractText / 0.1.1

Extract Text

Royalty Free
Extracts text from a given file
API Calls - 12,801
Avg call duration - N/A
Internet access: this is necessary for algorithms that rely on external services; however, it also implies that this algorithm is able to send your input data outside of the Algorithmia platform.

Run an example

"The Guide for Writing Word Documents in Microsoft Word for EasyChair Publication
Andrei Voronkov1[footnoteRef:1] and Kryštof Hoder1[footnoteRef:2] [1: Masterminded EasyChair and created the first stable version of this document] [2: Created the first draft of this document]

1 University of Manchester, Manchester, U.K.
andrei@voronkov.com, hoderk@cs.man.ac.uk
In order to ease the lives of authors, editors, and trees, we present a manual and an example of the use of Microsoft Word and similar tools for creating documents for EasyChair-based electronic and on-paper publishing of workshop and conference proceedings.
The styles and parameters of this guide are designed for compliance with the requirements for publication in the EasyChair conference system (Voronkov, 2004), and are also applicable to the Procedia publications series by Elsevier Science. EasyChair is a free conference management system that is flexible, easy to use, and has many features to make it suitable for various conference models. It is currently probably the most commonly used conference management system (Voronkov, 2004). The use of EasyChair and this style for creating Procedia volumes is a pilot project between Elsevier Science and EasyChair.
EasyChair publications accept documents written either in LaTeX or using a docx document format, which can be produced by Microsoft Word or LibreOffice. This guide explains how to produce the docx format in Microsoft Word. To achieve high quality of volumes, both LaTeX and Word documents should have the same layout and similar styles. This guide is provided for the users of Microsoft Word and describes EasyChair style version 3.4. 
To produce a document complying with the EasyChair and Procedia style you can simply take this guide and modify it. Several Word document styles are defined and used in this guide. To apply these styles in Microsoft Word you can use the Formatting Palette. The first line should contain the title of the document and be in the style Title. This line should be followed by a line in the style Authors which specifies authors of the publication, separated by commas, except for the last author which is separated from the rest of the authors by an “and”. Next to each author there may be a superscript number linking the author to an organization (see above) and a footnote specifying the role of the author in preparation of the publication.
The author line should be followed by lines in the style Institute which provide the organizations with which the authors are affiliated. After each institute there are contact details of the authors in the style Monospaced.
The abstract should be preceded by a line in the style Abstract title containing the word “Abstract”, and the abstract itself should use the style Abstract.
Styles for the Article Body
Section headers should use the style Section, while subsection headers use the style Subsection. For example, this text is part of Section 2 (Styles) and Subsection 2.1 (Styles for the Article Body).
The main text of the document should be written using the style Normal. For mono-spaced parts of the text (such as source code listings) we provide the style Monospaced. For a sans serif font, use the style Sans-Serif. In this guide, we use the sans serif style for the names of Word styles.
While section headers should use the style Section, the header of the References section should be in the style References. This style is similar to the style Section, but it is not numbered, again to resemble the EasyChair LaTeX style, including its procedia option.
Adding Citations
For citations it is recommended that you use the bibliography mechanisms of Microsoft Word or other tools able to process docx documents. In Microsoft Word, commands for inserting citations are located at the Document Elements tab of the ribbon control in the section References. Clicking the Manage button opens a toolbox, which allows you to add referenced publications and to insert them at the position of the cursor by double-clicking on them. If you use Microsoft Word for Mac, open the main toolbox and select the Citation tab instead. From there you can add and insert referenced publications as in other versions of Word. 
The references (or the bibliography) section of the article is created by clicking on the Bibliography button in the References section of the ribbon control mentioned above (or under Bibliographies in the Document Elements tab of the ribbon control in Microsoft Word for Mac). After inserting this section, the style of its header should be set to References.
The automatically generated References section may need to be explicitly updated to reflect further changes done in the bibliography. When clicking on the automatically generated text of the section, a Bibliography button will appear in the text and offer a menu with an update command.
This feature is not available in Microsoft Word 2003, so if you are reading this guide as a Word 2003 XML document, citations and the bibliography will appear as a static text and will not be updated automatically.
Adding Figures and Tables
In Microsoft Word, pictures can be inserted into the document by going to Insert->Picture->From File… on the menu and selecting the desired file. To simplify working with the image, it is recommended that you insert the picture into a text box. In order to make it into a figure and add a caption, select the image by clicking on it and then go to Insert->Caption… (or Insert->Reference->Caption… in earlier versions of Microsoft Word). From here, you can select the position of the caption (this should be set to below the image) and edit the text within it. Make sure that “Figure” is selected in the “Label” drop-down list and click “OK” to generate it. Captions are numbered automatically in sequential order. Figure 1 is an example of a captioned image.
If you have a table in your document, captions can be created in the same way; just select “Table” from the “Label” drop-down list instead. Table 1 shows an example of a table of data that was conveniently available.

Figure 1: Why one should use EasyChair
In order to cross-reference a figure or table in your text, go to Insert->Cross-reference… (or Insert->Reference->Cross-reference… in earlier versions) from the Word menu. Select the type of object you are referencing in the “Reference type” drop-down list, and then select which object you are referencing under “For which caption”. Ensure that the “Insert as hyperlink” box is ticked. You can choose how much of the caption is inserted in the “Insert reference to” drop-down menu. For example, to generate this cross-reference for Figure 1, “Only label and number” was selected.
EasyChair Style Requirements
The layout settings of this guide are set to conform to the requirements of EasyChair and Procedia publishing. If the settings are altered, the requirements below need to be kept in mind, since papers deviating from the formatting standards will not be accepted for printing. This section is mostly for your information since the best way to produce a conforming document is by modifying this guide.
The default paper size is US letter, but A4 or letter sizes are also acceptable.
The print area for all of the acceptable paper sizes is 145x224 mm. This size has been selected to allow for inexpensive printing using our current print-on-demand publisher.
The base font is Times New Roman, and the sans-serif font is Helvetica. The base font size is 10pt. If you use any other font size, there is no guarantee that the produced document will look nice or fit into our standard page size.
PNG, JPG, and PDF images are supported. If the papers are designed for publishing in print, the images should be at least 300dpi in resolution.

Table 1: LTB division results
ATP System | LTB /100 | CYC /35 | MZR /40 | SMO /25
Vampire-LTB 11.0
E-LTB 1.1pre
EP-LTB 1.1pre
E-KRH'-LTB 1.1.3

You will see if your article deviates from the EasyChair style page margins when you submit your article in the EasyChair or Procedia proceedings.
Bug Reports
Please report bugs, errors, and omissions you find in this guide to its current maintainer, Andrei Voronkov, at andrei@easychair.org. Any constructive feedback is welcome.
Submitting Your Article Through EasyChair
This section is intended only for the authors and editors of EasyChair proceedings and Procedia volumes. When you prepare an article for either of these, it should be submitted through EasyChair. EasyChair automates the submission process as much as possible and goes to a great length to ensure that your article can be published and printed. Publication for EasyChair means much more than just putting a PDF of your article online. It collects meta-information about the article to classify it, find similar articles, make it easily searchable, and index it in various Web services such as DBLP.
Please note that, when your conference is hosted on EasyChair, your article can be submitted twice: first for the preliminary submission and reviewing and then for inclusion in the proceedings. You will see that EasyChair has different environments and different interfaces for these two stages. You can tell which one you are in by your role displayed in the upper left corner of the screen. For the submission phase, it should display “author” and for the proceedings stage “proceedings author”. If you are asked to submit your final version and you only see the author role, then something is very wrong and you should contact your conference organizers or volume editors.
Submitting the Article for Reviewing
It is up to your conference in what format you submit your article. Most likely, they will simply ask for a PDF file. You will see the list of accepted file types on the EasyChair submission page. Nonetheless, if your conference will publish proceedings as an EasyChair or as a Procedia volume, we recommend that you use the EasyChair style for the submission too. The reason is that otherwise, if your paper is accepted for the proceedings, you will have to reformat it using the EasyChair style and this can be time-consuming. If you need to submit a PDF document, you can convert your docx document to PDF by choosing File->Save As… in your Word menu.
Submitting the Final Version for the Proceedings
EasyChair will ask you to submit two documents: the original docx document and the PDF document. The PDF document must be obtained by saving your Word docx document as a PDF. After your submission, EasyChair will show your PDF document with a frame around it to ensure that you did not change the style so that the content of your document goes out of the style margins. It will also ask you to enter some meta-data about your article, such as the title, list of authors, and the abstract. This meta-data is very important since it will appear on the Web page where your article is published; therefore, EasyChair will ask you to check that it is consistent with the data in the article itself.
Voronkov, A. (2004). EasyChair conference system. Retrieved from easychair.org


Install and use


Install the Algorithmia CLI client by running:

curl -sSLf https://algorithmia.com/install.sh | sh

Then authenticate by running:

algo auth
# When prompted for the API endpoint, press Enter
# When prompted for the API key, enter your key: YOUR_API_KEY


Then run the algorithm on the sample document:

algo run util/ExtractText/0.1.1 -d '"data://util/SampleCollection/easychair.docx"' --timeout 300
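
The example output above still carries docx field markers such as [footnoteRef:1]. The following is a minimal sketch of one way to strip them after extraction. It assumes the algo CLI prints the extracted text to standard output, which this page does not confirm. The helper name strip_footnote_refs and the file name easychair.txt are hypothetical and not part of the Algorithmia CLI.

```shell
# Hypothetical helper (not part of the Algorithmia CLI): reads text on stdin
# and removes docx footnote markers of the form "[footnoteRef:N]".
strip_footnote_refs() {
  sed -E 's/\[footnoteRef:[0-9]+\]//g'
}

# Possible usage, assuming `algo` is installed and authenticated and that it
# writes the extracted text to stdout (requires network access and an API key):
#
#   algo run util/ExtractText/0.1.1 \
#     -d '"data://util/SampleCollection/easychair.docx"' --timeout 300 \
#     | strip_footnote_refs > easychair.txt
```

The filter only targets the footnoteRef markers; the bracketed footnote bodies (for example "[1: ...]") are left untouched, since removing them would discard content.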