Linux HowTo: Removing unnecessary plugins from Adobe Acrobat Reader DC to make the application faster

I tried moving some of the plugin (.api) files to another directory to speed up the Adobe Acrobat DC PDF viewer.

Yet, the highlighting (annotation) functionality doesn’t seem to work after transferring some of the plugin files.

So, can anyone give me the list of plugins which are required for highlighting functionality to work so that I can get rid of the rest?

Linux HowTo: How to get a PDF which converts an already drawn sample to uniform [closed]

Suppose I have a large data pool with a particular PDF, $F(x)$, on the interval $[x,y]$, estimated from a KDE of the data pool. I drew $N$ samples at random from that data pool and saw that their distribution is also represented quite well by $F(x)$. Let this draw be $D_{bef} \sim F(x)$.

Now I want another distribution $G(x)$, on the same interval, such that if I draw another $N$ samples from $G(x)$, $D_{aft} \sim G(x)$, then the total $2N$ samples follow $(D_{bef} + D_{aft}) \sim \mathrm{Uniform}[x,y]$.

Is it possible? Many questions here ask how to generate uniform samples from a PDF, but I want to draw from a PDF which, when combined with my original draw, makes the pooled sample uniform.

It’s not always possible: suppose that $F$ samples almost entirely from a subinterval $[a,b]$. The combined sample will be almost 50% from that subinterval, so if $b-a < \frac{1}{2}(y-x)$ you can’t get uniformity on $[x,y]$.

This shows what you need: for every subset of $[x,y]$, $F$ must put no more than twice as much probability on it as the uniform distribution does. That is, you need the density $f(s)$ of $F$ to be no more than twice the uniform density $1/(y-x)$ everywhere on the interval. The pooled sample is then an equal mixture with density $\frac{1}{2}\bigl(f(s) + g(s)\bigr)$, so you can sample from the distribution $G$ with density
$$g(s) = \frac{2}{y-x}-f(s)$$
to bring the pooled sample up to uniformity.
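
As an illustration only, the following Python sketch draws from $g$ by rejection sampling and checks that the pooled sample is roughly uniform. The example density $f(s) = 0.5 + s$ on $[0,1]$, the helper names, the sample size and the seed are all assumptions made for this sketch; none of them come from the question.

```python
import random


def sample_complement(f, lo, hi, n, rng):
    """Draw n samples from g(s) = 2/(hi-lo) - f(s) by rejection sampling."""
    g_max = 2.0 / (hi - lo)  # g never exceeds this because f >= 0
    out = []
    while len(out) < n:
        s = rng.uniform(lo, hi)
        g = g_max - f(s)
        if g < 0:
            # f is more than twice the uniform density here: no valid g exists
            raise ValueError("f exceeds twice the uniform density")
        if rng.uniform(0.0, g_max) < g:
            out.append(s)
    return out


def example_f(s):
    """Hypothetical density on [0, 1]; its maximum is 1.5, below the bound 2."""
    return 0.5 + s


def sample_example_f(n, rng):
    """Inverse-CDF sampling for example_f: solve s^2/2 + s/2 = u for s."""
    return [(-1.0 + (1.0 + 8.0 * rng.random()) ** 0.5) / 2.0 for _ in range(n)]


def bin_fractions(samples, lo, hi, bins):
    """Fraction of samples falling in each of `bins` equal-width bins."""
    counts = [0] * bins
    for s in samples:
        idx = min(int((s - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    return [c / len(samples) for c in counts]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
before = sample_example_f(n, rng)                        # D_bef ~ F
after = sample_complement(example_f, 0.0, 1.0, n, rng)   # D_aft ~ G
fractions = bin_fractions(before + after, 0.0, 1.0, 4)
print(fractions)  # each entry should be close to 0.25
```

Whatever $f$ is, about half of the proposals are accepted, because $g$ integrates to 1 under an envelope of height $2/(y-x)$ and area 2.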

Making Game: How can I use Font Awesome vectors in Illustrator?

I’m stumped. As directed, I printed the Font Awesome Cheatsheet to PDF. When I open it with Acrobat Reader, it looks fine. However, when I try to open it with Illustrator, I get this warning:

The font MuseoSlab-500 is missing.  Affected text will be displayed using a substitute font.
The font OTS-derived-font is missing.  Affected text will be displayed using a substitute font.
The font ProximaNova-Regular is missing.  Affected text will be displayed using a substitute font.

How can I ‘fix’ the PDF, so that I can see and use the icons in Illustrator?

Open the PDF in Acrobat Pro and use Save As… to export it as Encapsulated PostScript (.eps).

It will create two separate files that can be opened in Adobe Illustrator. The icons and type should all be converted to outlines.

Try using http://image.online-convert.com/convert-to-eps to convert the PDF to EPS; you can then use it in Illustrator.

Linux HowTo: Customizing Firefox’s built-in PDF.js with CSS

I’m running Firefox 77 on Windows. I use Firefox’s built-in PDF.js viewer as my default. However, I would like to modify the CSS for the viewer. Specifically, I’d like to change

.pdfViewer .page {
...
margin: 1px auto -8px auto;
...
border: 9px solid transparent;
...
}

to margin: 1px auto -3px auto and border: 1px dashed transparent.

How would I do this? I don’t think this is something for userChrome because it’s not part of the interface, yet I don’t see where the pdf.js code is stored (Search Everything has no relevant results for pdf.js, pdf.worker.js, or viewer.css). A userstyle/userscript probably won’t work since it’s an internal page, so I’m out of ideas. Can someone help me with this?

Edit: I tried a userscript, but it didn’t work even though it showed that the script was active on the page. That probably means userscripts can’t affect Firefox’s internal files.

It turns out that you can use userContent.css, which styles actual pages. So, I copied

.pdfViewer .page {
    margin: 1px auto -3px auto !important;
    border: 1px dashed gray !important;
}

into my userContent.css and it worked.
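
For anyone who prefers to script this step, here is a minimal Python sketch that appends the rule to userContent.css inside a profile’s chrome folder. The function name and the profile path are hypothetical. It also assumes the usual setup in which Firefox only loads userContent.css when the about:config preference toolkit.legacyUserProfileCustomizations.stylesheets is set to true; the answer above does not mention that step.

```python
from pathlib import Path

# The rule from the answer above, unchanged.
PDFJS_RULE = """.pdfViewer .page {
    margin: 1px auto -3px auto !important;
    border: 1px dashed gray !important;
}
"""


def install_rule(profile_dir, rule=PDFJS_RULE):
    """Append `rule` to <profile_dir>/chrome/userContent.css unless it is already there."""
    chrome_dir = Path(profile_dir) / "chrome"
    chrome_dir.mkdir(parents=True, exist_ok=True)
    css_path = chrome_dir / "userContent.css"
    existing = css_path.read_text(encoding="utf-8") if css_path.exists() else ""
    if rule.strip() not in existing:
        # Keep whatever is already in the file and add the rule on a fresh line.
        separator = "" if existing == "" or existing.endswith("\n") else "\n"
        css_path.write_text(existing + separator + rule, encoding="utf-8")
    return css_path


# Usage (hypothetical profile path, adjust to your own profile folder):
# install_rule(r"C:\path\to\your\firefox\profile")
```

Firefox reads the file at startup, so a restart is needed before the change shows up in the viewer.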

Linux HowTo: Why isn’t Adobe Acrobat’s ‘Map Colors’ fixup changing a PDF’s text colors?

Please see the screenshot beneath. I’m using Adobe Acrobat Pro DC Version 2019.010.20064. The PDF in question can be downloaded here.

I’m trying to use mrserge’s method to convert the book’s default (C, M, Y, K) = (99%, 98%, 18%, 6%) (i.e. dark purple) to (90, 0, 90, 0) (green). After I click ‘OK’ and then ‘Fix’, Adobe Acrobat executes the fixup, but the color does not change.

Changing a color with bare hands

I cannot provide instructions for Adobe Acrobat Pro; however, I can suggest a procedure using open source programs, with some hints that are specific to the OP’s document but may also be useful to other readers.

A PDF file, once uncompressed, is somewhat human-readable.

  • So let’s first uncompress the file with pdftk:

    pdftk OriginalFile.pdf output Uncompressed.pdf uncompress
    
  • Then we can start looking inside and making the changes manually(!) … well, not really manually.
    We want to replace a color throughout the document.
    In the OP’s file, after decompression, the color is stored as RGB components, each in the range 0..1.

  • Instead of finding something like (C, M, Y, K) = (63%, 63%, 0%, 51%), we find something like (r, g, b) = (0.181 0.181 0.488). In the OP’s file the occurrences are lines like

    /CS0 cs 0.181 0.181 0.488  scn
    
  • To be honest, there are a lot of them (2048). With a single command we can replace them all at once. Here we use sed to substitute each occurrence of the string 0.181 0.181 0.488 with the string 0.102 1.000 0.102, globally (throughout the file). The dots are escaped so that they match only a literal dot. Of course, it is also possible to use any editor able to deal with binary files:

    sed 's/0\.181 0\.181 0\.488/0.102 1.000 0.102/g' Uncompressed.pdf > newfile.pdf
    

To selectively replace that color string only in some places, we need to identify where.


Notes

There are several different occurrences of that color string that act differently on the text.

   1 /CS1 cs 0.181 0.181 0.488  scn   # Logo on page 5 
  96 /CS0 CS 0.181 0.181 0.488  SCN   # No visual effect
 826 /CS0 cs 0.181 0.181 0.488  scn   # Headers, page n., some(not all) other parts
1095 0.181 0.181 0.488  scn           # The other text

It is possible to act on each kind separately, e.g. with

sed 's/CS0 cs 0\.181 0\.181 0\.488  scn/CS0 cs 0.900 0.000 0.000  scn/g' Uncompressed.pdf > newfile.pdf
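
The same replacement can be sketched in Python, working on raw bytes so the binary parts of the file pass through untouched. The function names are hypothetical. The same-length check is an added precaution, not part of the sed approach: it keeps the byte offsets and stream lengths recorded inside the PDF valid, which is also the case for the two sed examples because each replacement string has the same length as the one it replaces.

```python
from typing import Tuple

OLD_RGB = b"0.181 0.181 0.488"  # color operands found in the uncompressed file
NEW_RGB = b"0.102 1.000 0.102"  # replacement operands, same byte length


def replace_color(data: bytes, old: bytes, new: bytes) -> Tuple[bytes, int]:
    """Replace every occurrence of `old` with `new`; return (patched data, hit count)."""
    if len(old) != len(new):
        # A different length would shift the byte offsets stored in the PDF.
        raise ValueError("old and new operands must have the same byte length")
    return data.replace(old, new), data.count(old)


def recolor_file(src_path, dst_path, old=OLD_RGB, new=NEW_RGB):
    """Read an uncompressed PDF, patch the color operands, write the result."""
    with open(src_path, "rb") as fh:
        data = fh.read()
    patched, hits = replace_color(data, old, new)
    with open(dst_path, "wb") as fh:
        fh.write(patched)
    return hits


# Usage with the file names from the steps above:
# print(recolor_file("Uncompressed.pdf", "newfile.pdf"))
```

To change only one kind of occurrence, pass longer byte strings that include the surrounding operators, for example b"/CS0 cs 0.181 0.181 0.488  scn" and b"/CS0 cs 0.900 0.000 0.000  scn".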

Server Bug Fix: Multi-tab pdf reader and a way of opening pdfs in the same tab but not from within the app?

I’m using qpdfview and Okular. Both support tabs, but I do not see how to do the following:

a) open (double-click) files and have them open in one window as tabs
b) “merge” all open windows into one with tabs.

I am looking for a reader that can open everything in one window, or a way of making this happen via some config.

Yes, there is a way.

For qpdfview:

  • copy /usr/share/applications/qpdfview.desktop to ~/.local/share/applications/qpdfview.desktop
  • edit ~/.local/share/applications/qpdfview.desktop so that the Exec line reads Exec=qpdfview --unique %f
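
The two qpdfview steps above can also be scripted. The following Python sketch is only one way to do it: the function names are hypothetical, and it assumes the system file has a standard [Desktop Entry] group whose Exec line is the one to change.

```python
from pathlib import Path

SYSTEM_FILE = Path("/usr/share/applications/qpdfview.desktop")
USER_FILE = Path.home() / ".local/share/applications/qpdfview.desktop"


def set_unique_exec(desktop_text, exec_line="Exec=qpdfview --unique %f"):
    """Return desktop_text with the Exec line of [Desktop Entry] replaced."""
    out = []
    in_main_group = False
    for line in desktop_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Track which group we are in; other groups keep their own Exec lines.
            in_main_group = stripped == "[Desktop Entry]"
        if in_main_group and stripped.startswith("Exec="):
            line = exec_line
        out.append(line)
    return "\n".join(out) + "\n"


def install_override(src=SYSTEM_FILE, dst=USER_FILE):
    """Copy the system .desktop file to the user directory with the new Exec line."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    dst.write_text(set_unique_exec(src.read_text(encoding="utf-8")), encoding="utf-8")
    return dst


# Usage: run install_override() once, then open PDFs from the file manager.
```

The copy in ~/.local/share/applications takes precedence over the system file, so the original stays untouched.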

For Okular there might be a similar way, but I did not find one.

For Okular, enable this option: Settings->Configure Okular->General->Program Features->Open new files in tabs.

Mendeley, primarily a reference manager, has a built-in pdf reader. PDF files open up in Tabs from within or outside the application. Just try it and it may well become your default PDF reader app due to speed and convenience.

Code Bug Fix: Extract data from pdf invoices of varying formats

The objective is to extract data from invoices in PDF format.

PDF data format:
Selectable text (not scanned images) consisting of lines of text, name-value pairs, and tables (of varying lengths)

Invoice data includes:
invoice_no, invoice_date, order_no, order_date in name-value pairs
item details (item_code, name, rate, quantity, discount, price, etc.) in table format
final_taxation_info and gross_total

Inputs:
A batch of invoices is received weekly, in both similar and distinct formats

Outputs:
Extract the invoice data and insert it into a database

Approaches tried or considered so far:

  1. Writing a custom algorithm in C# using libraries, like iText7, PDFix, GemBox.Pdf, GroupDocs.Parser, Bytescout.PDFExtractor, Sautinsoft.pdffocus, Spire.PDF, etc.
    Downside: Have to modify or write a new algorithm for a new pdf format.
  2. Data extraction tools, like SmallPDF, Convertapi.com, cometdocs.com, groupdocs.app.
    Downside: No control over the extraction algorithm.
  3. Template guided extraction, like Pdf_Element, Tabula, Docparser, iText pdf2Data.
    Downside: Fails when the table length varies.
  4. AI/ML-based extraction, automation tools/services, like AWS Textract, UiPath, KlearStack, IQ Bot (I have not tried this last approach practically in-depth, just scratched the surface).
    Downside: Not sure but seems like learning curve or cost could be stumbling blocks.

Considering this whole scenario, can anybody suggest which approach I should follow?

We used approach 1 at our organization. You have to come up with a pipeline of PDF -> free text -> formulated expressions to extract the fields.
AI tools would work only if you have a large set of documents that you can “train” the AI with.
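
To make the “free text -> formulated expressions” step concrete, here is a small Python sketch. It assumes the text has already been extracted from the PDF by one of the libraries listed in the question. The field labels, the patterns, the column order of the item rows and the sample invoice are all hypothetical; every real invoice format would need its own set of patterns.

```python
import re

# One pattern per name-value field; the labels are assumptions for this sketch.
FIELD_PATTERNS = {
    "invoice_no": re.compile(r"Invoice\s*No\.?\s*[:#]?\s*(\S+)", re.IGNORECASE),
    "invoice_date": re.compile(r"Invoice\s*Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "gross_total": re.compile(r"Gross\s*Total\s*:?\s*([\d,]+\.\d{2})", re.IGNORECASE),
}

# An item row: code, free-form name, rate, quantity, price, all on one line.
ITEM_ROW = re.compile(
    r"^(?P<item_code>\S+)[ \t]+(?P<name>.+?)[ \t]+(?P<rate>\d+(?:\.\d+)?)[ \t]+"
    r"(?P<quantity>\d+)[ \t]+(?P<price>\d+(?:\.\d+)?)[ \t]*$",
    re.MULTILINE,
)


def extract_fields(text):
    """Return the first match of each name-value pattern, or None if absent."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


def extract_items(text):
    """Return every line shaped like an item row, however many rows there are."""
    return [m.groupdict() for m in ITEM_ROW.finditer(text)]


SAMPLE_TEXT = """Invoice No: INV-1001
Invoice Date: 2020-06-15
Code Name Rate Qty Price
A100 Widget large 10.00 3 30.00
B200 Gasket 2.50 4 10.00
Gross Total: 40.00
"""

fields = extract_fields(SAMPLE_TEXT)
items = extract_items(SAMPLE_TEXT)
print(fields)
print(items)
```

Because rows are matched line by line, the table can have any number of rows. The cost is the downside already noted for approach 1: a new layout needs new patterns.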

http://www.puntechsolutions.com.au/smartdt.html

Ubuntu HowTo: How do I digitally sign PDFs in 2019?

This older post is either pointing to mostly dead software or the answers are not fully applicable.

I want to take a PDF document, stick in an image of my signature and have this be digitally signed using a certificate so that the document is secured and any changes will be picked up.

I’d like to open a document, navigate to the relevant signature page, click on the line or draw a box, enter a password and my signature should be drawn and certificate used to digitally sign the doc.

I’ve tried the following options and here are the problems:

  • LibreOffice: Difficult to sign existing PDFs; better suited to creating PDFs. The signature image has to be added separately.

  • PortableSigner: Hard to position the signature, but it does the job.

  • Master PDF Editor: Works well, but costs 70 dollars to prevent an ugly watermark being added to PDFs.

  • Foxit Reader: Only adds an image, without any certificate signing.

Any ideas?

I recommend you go through the list of OpenSC based applications. OpenSC is the base library of most applications using smartcard and USB key hardware certificates.

At first glance, the following seem interesting for your use case (though I haven’t tried them myself yet):

I use DocuSign, which is a free web app (for single signatures). It also serves as a (hopefully) trusted third party.

From DocuSign – Wikipedia:

DocuSign, Inc. is an American company headquartered in San Francisco, California that allows organizations to manage electronic agreements. As part of the DocuSign Agreement Cloud, DocuSign offers eSignature, a way to sign electronically on different devices. DocuSign claims it has over 475,000 customers and hundreds of millions of users in more than 180 countries. Signatures processed by DocuSign are compliant with the US ESIGN Act and the European Union’s eIDAS regulation, including EU Advanced and EU Qualified Signatures.

In LibreOffice, create your document as usual as a new document (*.doc or *.odt). When the document is finished, add a watermark as described here:

https://libreofficehelp.com/how-to-add-watermark-in-libreoffice-writer/

Once the watermark is set, you can export the document to PDF.

Linux HowTo: How can I view pdf meta-data in Windows Explorer?

I unsuccessfully used the “pages” feature in Windows Explorer, as well as in Directory Opus 10 and Free Commander XT (which I installed just for that reason, to try it out) to display the page count of multiple PDFs in a folder.

All my PDFs are free to edit, i.e. not write-protected. I don’t understand why any PDF reader can display the (correct) page count, but none of the file explorers can. (In the “details” view, of course.)

The only documents whose page count is displayed are MS Word documents.

As far as I know, a Shell Extension Handler for PDF has to be installed to provide this information, but is there one?

On a side-note: Did that change in Windows 8?

Initial research: Google search was unsuccessful, the only slightly related SE topic I found was “How to count pages in multiple PDF files?“.

Windows 7 Home Professional 64b

For non-natively supported file types the Windows Shell needs shell extension handlers (sort of a plugin) to extend some of the Shell functionality to these file types.

Under Windows XP or earlier, the Shell uses the so-called Column Handler extension to show file metadata in the Windows Explorer details view columns.

Since Vista, the Shell uses the more versatile Property (or Metadata) Handler. This extension is also used to show and edit file metadata in the Windows Explorer details pane and the file properties details tab, and to show metadata in many other file dialogs (file delete confirmation, etc.).
It’s also required to have this file metadata indexed by the Windows Search indexer. The indexer may also use an IFilter extension to index the files’ text content.

The PDF file type is not natively supported by the Windows Shell (this has not changed in Windows 8.x, nor in Windows 10), so you will need to install a PDF Property Handler in order to access PDF metadata.

I develop a commercial tool, the PDF-ShellTools, that provides the Windows Shell with a PDF Property Handler.

You can sort and filter PDF files based on title, page count, etc. using this shell extension: Debenu

Additionally, this portable application extracts all data from PDF files and produces tabular output which you can use in your workflow: pdfinfogui

The free utility PDF Property Extension! by CoolSoft provides those columns, and also shows the information in the file properties dialog.

This utility appears to be kept up to date, and supports e.g. Windows 10.

I know you’re asking about viewing pdf page count in windows explorer, but if what you’re looking for is a list of pdfs with the page number of each, Acrobat Pro does that.
1. Under the File menu select “Organizer.”
2. Go to the folder with the pdfs you’re interested in.
3. In the “Sort by” field, select “Number of Pages.”
That will display the number of pages for each pdf file. Not exactly what you want. But should do the trick.

Everyone wants to use a tool, get the page count and then export it to Excel. Why not use Excel to do the page counting and then put it in a sheet?

Look at http://www.mrexcel.com/forum/excel-questions/347911-visual-basic-applications-page-count.html

This is the code and it works fantastic:

Sub Test()
    Dim MyPath As String, MyFile As String
    Dim i As Long
    MyPath = "C:\TestFolder"
    MyFile = Dir(MyPath & Application.PathSeparator & "*.pdf", vbDirectory)
    Range("A:B").ClearContents
    Range("A1") = "File Name": Range("B1") = "Pages"
    Range("A1:B1").Font.Bold = True
    i = 1
    Do While MyFile <> ""
        i = i + 1
        Cells(i, 1) = MyFile
        Cells(i, 2) = GetPageNum(MyPath & Application.PathSeparator & MyFile)
        MyFile = Dir
    Loop
    Columns("A:B").AutoFit
    MsgBox "Total of " & i - 1 & " PDF files have been found" & vbCrLf _
           & " File names and corresponding count of pages have been written on " _
           & ActiveSheet.Name, vbInformation, "Report..."
End Sub
'
Function GetPageNum(PDF_File As String)
    'Haluk 19/10/2008
    Dim FileNum As Long
    Dim strRetVal As String
    Dim RegExp
    Set RegExp = CreateObject("VBscript.RegExp")
    RegExp.Global = True
    RegExp.Pattern = "/Type\s*/Page[^s]"
    FileNum = FreeFile
    Open PDF_File For Binary As #FileNum
        strRetVal = Space(LOF(FileNum))
        Get #FileNum, , strRetVal
    Close #FileNum
    GetPageNum = RegExp.Execute(strRetVal).Count
End Function
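
For readers without Excel, the same counting heuristic can be sketched in Python. This is a rough equivalent, not a port of the macro’s spreadsheet output, and the function names are hypothetical. Like the VBA version, it only scans the raw bytes for page objects, so it can undercount PDFs that store their objects in compressed object streams.

```python
import re
from pathlib import Path

# "/Type /Page" followed by anything except "s", so "/Type /Pages" is skipped.
PAGE_PATTERN = re.compile(rb"/Type\s*/Page[^s]")


def count_pages(data: bytes) -> int:
    """Count page objects in raw PDF bytes using the same regex heuristic."""
    return len(PAGE_PATTERN.findall(data))


def count_pages_in_folder(folder):
    """Return {file name: page count} for every .pdf file directly inside folder."""
    return {
        path.name: count_pages(path.read_bytes())
        for path in sorted(Path(folder).glob("*.pdf"))
    }


# Usage with the folder from the macro above:
# for name, pages in count_pages_in_folder(r"C:\TestFolder").items():
#     print(name, pages)
```

A dedicated PDF library would give exact counts; the regex scan is only worth using when installing one is not an option.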

I have been able to achieve the desired result with Tracker Software’s PDF-XChange Viewer. It’s a Freeware PDF viewer with some extras, including a Shell Extension and iFilter.

To register the shell extension you have to:

  1. Install the viewer
  2. Set it as default program for PDFs (this activates the shell extension)
  3. Optionally, reset your default for opening PDFs to the previous value (the program even has an option for it)

Happy times!

I can see (and sort by) the PDF page count of PDF files in Windows 10 Explorer, when the “pages” column is activated. Not sure if this feature is provided by PDF-XChange Editor which I have installed.
