Score:0

Is IRM preventing Office365-generated PDFs from being searchable?


We have an intranet on which users often post information as PDFs.

Many of these PDFs are non-searchable. That is, they appear to be images: the text cannot be selected or searched, and SharePoint search is unable to index the contents. However, some are fine.

The metadata on the non-searchable PDFs shows them being saved on Windows using Office365 on company-issued hardware. I have asked some users named in the metadata, and they confirm all they are doing is saving as PDF from Word.

The metadata on unaffected (searchable) PDFs shows them being saved either on MacOS computers (typically using InDesign) or from Google Chrome (perhaps via some extension?).

Is this an Information Rights Management issue?

If so, I'm not responsible for that, but I can find the person who is and ask them.

Or might it be some other issue?

EDIT:

Here's what I'm seeing. Note that I have not been able to test No. 2 myself; I have only been told that this happens:

  1. Save a PDF from Word on a Windows machine not supplied by the company. The resulting PDF is searchable by default.

  2. Save a PDF from Word on a Windows machine supplied by the company. The resulting PDF is not searchable and cannot be made so by the user.

  3. Save a PDF from Word on a Macintosh machine supplied by the company. The resulting PDF is searchable by default (using either "Best for printing" or "Best for electronic distribution (uses Microsoft online service)").

Score:0

Further investigation indicates that no, this is not an IRM issue. But it is platform dependent.

While I have yet to find a citation to support this, Windows machines using MS Word (Office365) appear to save PDFs as searchable by default if "standard" fonts are used in the document, that is, Calibri, Arial, Times Roman, etc. (I am unsure of the exact definition of "standard" here.)

If a "non-standard" font is used which needs to be embedded, Word on Windows will produce a PDF that cannot be searched.

However, if you use MS Word on MacOS, the same "non-standard" fonts appear to be handled as searchable text in the resulting PDF.

Whether this is to do with the construction of the font itself or with some underlying Windows font-handling issue, I don't know; it requires further testing.

So, while this answers the original question, it does not solve the wider problem, which leaves the intranet about 50% useless.
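To size the wider problem, a rough script along these lines might help sort existing PDFs into searchable and image-only. It is only a heuristic sketch using the Python standard library: it looks inside each PDF stream (inflating it where it is Flate-compressed) for the `Tj`/`TJ` text-showing operators. The function names `has_text_layer` and `image_only_pdfs` are made up for illustration, and the check assumes unencrypted files whose page content uses no filter other than Flate, so a proper PDF library would be more reliable for anything beyond a first pass.

```python
import re
import zlib
from pathlib import Path

# Matches the body of each PDF stream object (page content, images, fonts, ...).
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A string or array operand followed by a text-showing operator (Tj or TJ).
SHOW_TEXT_RE = re.compile(rb"[)>\]]\s*T[jJ]\b")


def inflate(raw: bytes) -> bytes:
    """Return Flate-decompressed bytes, or the input unchanged if it is not Flate data."""
    try:
        return zlib.decompressobj().decompress(raw)
    except zlib.error:
        return raw


def has_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if any stream has a BT marker and a text-showing operator."""
    for match in STREAM_RE.finditer(pdf_bytes):
        content = inflate(match.group(1))
        if b"BT" in content and SHOW_TEXT_RE.search(content):
            return True
    return False


def image_only_pdfs(folder: str) -> list[str]:
    """List PDFs under `folder` that show no sign of a text layer."""
    return sorted(
        str(path)
        for path in Path(folder).rglob("*.pdf")
        if not has_text_layer(path.read_bytes())
    )
```

Calling `image_only_pdfs()` on a local copy of the document library would give a list of files to re-export. Binary image data can occasionally contain the same byte patterns by chance, so a file reported as searchable is worth spot-checking by hand.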
