[SOLVED:] Corrupted word doc - document.xml corrupted



esmond
11-10-2015, 02:40 PM
This Word document contains a full semester of my friend's university work. She had NO backups, and the file was kept on a USB key (hard to believe, but true).
I have tried a number of recovery tools, including onlinerecovery.com, DataNumen, Corrupt Docx, etc. They all failed.
I tried renaming it to .zip and extracting document.xml - no luck.
I tried a hex editor to see if I could find any text - nothing.
I'm not interested in recovering the images in the file (got those OK). I just need the text, if possible (or can you point me to tools that might work?).
Thank you very much.

Paul_Hossler
11-10-2015, 07:20 PM
https://www.piriform.com/recuva

I've had good luck with the free version of Recuva for seeing which deleted files are still recoverable.

If you can run it on the USB key, you might find an earlier or temporary version that was deleted but is still lying around (assuming she hasn't written over it).

It sounds like if a hex editor doesn't show any text in the file, then there's no text in the file.

How big is/was the file?

felton198511
11-11-2015, 08:35 PM
In that case, send me the corrupt Word doc. My email is slchcw(at)yahoo.com. I can help analyze and repair the file manually, for free.

brunoo
11-12-2015, 11:08 AM
The DOCX file format is just a collection of XML based layout files and other files like images packaged into one using standard "zip" compression.
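Since a DOCX is a ZIP container, its parts can also be listed and pulled out without renaming the file. Below is a minimal sketch using Python's standard zipfile module; the helper names list_docx_parts and read_document_xml are made up for illustration, and the sketch assumes the ZIP directory itself is still readable (a badly damaged container raises zipfile.BadZipFile instead).

```python
import zipfile


def list_docx_parts(path):
    """Return the names of the files packed inside a .docx (ZIP) container."""
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()


def read_document_xml(path):
    """Return the raw bytes of word/document.xml, the part that holds the body text."""
    with zipfile.ZipFile(path) as archive:
        return archive.read("word/document.xml")
```

If read_document_xml succeeds, the returned bytes are the same "document.xml" you would get by renaming the file to .zip and extracting it by hand.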
The Word document does not open in Word and only shows a generic error on my Win7 SP1 PC with Office 2003 plus the Office 2007-2010-2013 file format converter installed.
Here are the details of the error I get when I view the extracted "document.xml" in the "XML Editor" that is part of MS Office, which opens it in Internet Explorer with color-coded tags and properly indented lines:
Message: An invalid character was found inside an entity reference.
Line: 2
Char: 1584323
Code: 0
URI: file:///C:/Documents and Settings/Bill/My Documents/Downloads/Reflective-Journal-Submission-/word/document.xml
The XML page cannot be displayed
Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later.
When I scroll to the end I see this where it is unable to display any more:
An invalid character was found inside an entity reference. Error processing resource 'file:///C:/Documents and Settings/Bil...
<w:b/><w:bCs/><w:sz w:val="52"/><w:szCs w:val="52"/></w:rPr><w:tab/></w:r><w:r ...
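An error like the one above means a strict XML parser gives up at the first bad character, even though most of the markup before and after it may still be intact. One way around that, as a rough sketch only, is to skip XML parsing altogether and pull the contents of the w:t (text run) elements out with a regular expression. The function name salvage_text is hypothetical, and the approach assumes the damage is local: formatting, tables and images are ignored, and any text inside the damaged region itself is lost or garbled.

```python
import html
import re

# Matches a text run: <w:t> or <w:t xml:space="preserve">, then its content.
# Requiring whitespace or ">" right after "w:t" keeps tags such as <w:tab/>
# or <w:tbl> from being mistaken for text runs.
TEXT_RUN = re.compile(r"<w:t(?:\s[^>]*)?>(.*?)</w:t>", re.DOTALL)


def salvage_text(xml_bytes):
    """Best-effort plain-text extraction from a possibly malformed document.xml."""
    # Undecodable bytes become U+FFFD instead of raising an exception.
    raw = xml_bytes.decode("utf-8", errors="replace")
    paragraphs = []
    # </w:p> closes a paragraph, so splitting on it roughly restores line breaks.
    for chunk in raw.split("</w:p>"):
        text = "".join(TEXT_RUN.findall(chunk))
        if text:
            # Turn entities such as &amp; back into plain characters.
            paragraphs.append(html.unescape(text))
    return "\n".join(paragraphs)
```

Feeding it the bytes of the extracted document.xml and writing the result to a .txt file gives something that can be pasted back into a new Word document and reformatted by hand.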

ashleyrobbin
11-15-2015, 05:43 AM
I guess this post on an authoritative source will help you as well: https://community.office365.com/en-us/f/155/t/255386

esmond
11-15-2015, 11:17 AM
Thank you guys for your feedback and help. Sorry for not responding for so long; the issue was solved and I forgot to say so. Thank you!