Accurate PDF-to-DOC extraction