Solved

Microsoft Sharepoint

  • 6 August 2013
  • 8 replies
  • 43 views

Badge
Is there any way that a FME workbench could be set up to take several thousand documents from a directory folder and load them into MS Sharepoint?
icon

Best answer by davideagle 18 February 2016, 17:06

View original

8 replies

Badge +14

I've also seen this requirement very recently where a customer is planning a data migration from one document management system to Sharepoint. At the moment we are looking at what options there are to use FME to do this via the Sharepoint Web Service. There are 4 million documents to migrate, which is far too much for CTRL-X and CTRL-V!

Has anyone done this via FME?

Badge +5

I've also seen this requirement very recently where a customer is planning a data migration from one document management system to Sharepoint. At the moment we are looking at what options there are to use FME to do this via the Sharepoint Web Service. There are 4 million documents to migrate, which is far too much for CTRL-X and CTRL-V!

Has anyone done this via FME?

Would a batch file in windows be easier to copy all the data up?

http://social.technet.microsoft.com/wiki/contents/articles/4437.upload-files-from-a-local-machine-to-a-sharepoint-document-library-automatically-dos-command.aspx

Userlevel 4
Badge +26

I'm not sure how a Sharepoint load works but, in general, to move/copy files use the Directory and File Pathnames reader (to get a list of files) and the File Copy writer (to move them).

Badge +14

Thanks for the suggestions all. This kind of file copy is not quite as simple as a standard directory file copy @Mark2AtSafe. A batch approach might be possible be we also need to do deduplication and update other file metadata, so want to try to get FME to do the job @marko. We have however got a proof of concept up and running with a lot of trial and error using the XMLTemplater and the SoapSender... so fingers crossed.

Badge +4

Thanks for the suggestions all. This kind of file copy is not quite as simple as a standard directory file copy @Mark2AtSafe. A batch approach might be possible be we also need to do deduplication and update other file metadata, so want to try to get FME to do the job @marko. We have however got a proof of concept up and running with a lot of trial and error using the XMLTemplater and the SoapSender... so fingers crossed.

 

Hi @1spatialdave,

 

 

I'm looking to do the same thing. Are you able to share your workbench?

 

 

or describe your solution a little more detail?

 

 

I'm currently stuck with SP2010, so I think will need to use the copy.asmx web service somehow.

 

 

Nick
Badge +3

Hi Nick

For a quick reference, you can upload documents by sending a SOAP message to the CopyIntoItem method of copy.asmx.

I build up the SOAP message using XMLTemplater.

An example ROOT Template:
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
 <s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <CopyIntoItems xmlns="http://schemas.microsoft.com/sharepoint/soap/">
      <SourceUrl>nothing</SourceUrl>
      <DestinationUrls>
        <string>$(SP_SITE){fme:get-attribute("SP_list_root_folder")}{fme:get-attribute("SP_path")}</string>
        <string>$(SP_SITE){fme:get-attribute("SP_list_root_folder")}{fme:get-attribute("SP_path2")}</string>
      </DestinationUrls>
      <Fields>
        <FieldInformation Type="Text" DisplayName="Surname" Value="Smith" />
        <FieldInformation Type="DateTime" DisplayName="Date of Birth" Value="1965-11-03" />
      </Fields>
      <Stream>{fme:get-attribute("document_base64_encoded")}</Stream>
    </CopyIntoItems>
  </s:Body>
</s:Envelope>

$(SP_SITE) is an example for workspace parameter and {fme:get-attribute("SP_list_root_folder")} for an attribute.

There are 4 important bits:

  1. <DestinationUrls>

     

    You need the full protocol://server/list_root_folder/SP_path/document.ext URL where you want the document to upload.

     

    The folder must exist on SharePoint

     

    You can upload the document to multiple destinations.

     

    You can upload the document to the same server where the copy.asmx rwsides only.
  2. <Stream>
    base64 encoded content of the document. I could upload documents up to 40 MB. It is worth to restrict the size because large documents can make the workspace stall.
  3. <Fields>

     

    Optionally you can set metadata.

     

    The Type and DisplayName should match. (There is another request that reveals what metadata exist in the list.)

     

    DateTime values should be in ISO format.

  4. The contents of <SourceUrl> is indifferent but cannot be empty (I don't know why)
Set Write XML Header to Yes in the XMLTemplater.

 


You can use the HTTPCaller transformer to send the message to the web service:
Request URL: protocol://yoursharepointserver/_vti_bin/copy.asmx
HTTP Method: POST

You need to add a header (SOAP method to call) in the Headers section :
Name: SOAPAction
Value: http://schemas.microsoft.com/sharepoint/soap/CopyIntoItems

In Body section:
Upload Data: Specify Upload Body
Upload Body: set the Result Attribute of the XMLTemplater
Content Type: text/xml 
Base64 encoding the document content is quite easy:
  1. AttributeFileReader transformer: set the Source File Character Encoding to Binary (FME-binary)
  2. BinaryEncoder transformer: Encoding Type: Base64
If you upload large number of documents, it is worth to delete file content, base64 encoded and SOAP request attributes after HTTPCaller to avoid eating up memory.
I created a CSV with the document source file path, SharePoint URL and optionally metadata columns. that drive the upload.
In theory the workspace is straightforward: CSV reader -> AttributeFileReader -> BinaryEncoder -> XMLTemplater -> HTTPCaller -> AttributeRemover -> Inspector
There are a few possible problems: missing or unreadable source file, large file, invalid characters in filename (SharePoint is more restrictive than tile system), ...
Another thing, proper attribute display names and types:
Send a request to

Request URL: protocol://yoursharepointserver/_vti_bin/lists.asmx


Headers:
Name: SOAPAction
Value: http://schemas.microsoft.com/sharepoint/soap/GetList

 


Body:
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
    <Body>
        <GetList xmlns="http://schemas.microsoft.com/sharepoint/soap/">
            <listName>$(SP_LIST_NAME)</listName>
        </GetList>
    </Body>
</Envelope>

The XML response will contain Type, DisplayName, .. of all attributes.
My workspaces are a bit specific and big but if there is any interest I can get rid of the unnecessary bits and upload.
Hope it helps.
Badge +4

Hi Nick

For a quick reference, you can upload documents by sending a SOAP message to the CopyIntoItem method of copy.asmx.

I build up the SOAP message using XMLTemplater.

An example ROOT Template:
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
 <s:Body xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <CopyIntoItems xmlns="http://schemas.microsoft.com/sharepoint/soap/">
      <SourceUrl>nothing</SourceUrl>
      <DestinationUrls>
        <string>$(SP_SITE){fme:get-attribute("SP_list_root_folder")}{fme:get-attribute("SP_path")}</string>
        <string>$(SP_SITE){fme:get-attribute("SP_list_root_folder")}{fme:get-attribute("SP_path2")}</string>
      </DestinationUrls>
      <Fields>
        <FieldInformation Type="Text" DisplayName="Surname" Value="Smith" />
        <FieldInformation Type="DateTime" DisplayName="Date of Birth" Value="1965-11-03" />
      </Fields>
      <Stream>{fme:get-attribute("document_base64_encoded")}</Stream>
    </CopyIntoItems>
  </s:Body>
</s:Envelope>

$(SP_SITE) is an example for workspace parameter and {fme:get-attribute("SP_list_root_folder")} for an attribute.

There are 4 important bits:

  1. <DestinationUrls>

     

    You need the full protocol://server/list_root_folder/SP_path/document.ext URL where you want the document to upload.

     

    The folder must exist on SharePoint

     

    You can upload the document to multiple destinations.

     

    You can upload the document to the same server where the copy.asmx rwsides only.
  2. <Stream>
    base64 encoded content of the document. I could upload documents up to 40 MB. It is worth to restrict the size because large documents can make the workspace stall.
  3. <Fields>

     

    Optionally you can set metadata.

     

    The Type and DisplayName should match. (There is another request that reveals what metadata exist in the list.)

     

    DateTime values should be in ISO format.

  4. The contents of <SourceUrl> is indifferent but cannot be empty (I don't know why)
Set Write XML Header to Yes in the XMLTemplater.

 


You can use the HTTPCaller transformer to send the message to the web service:
Request URL: protocol://yoursharepointserver/_vti_bin/copy.asmx
HTTP Method: POST

You need to add a header (SOAP method to call) in the Headers section :
Name: SOAPAction
Value: http://schemas.microsoft.com/sharepoint/soap/CopyIntoItems

In Body section:
Upload Data: Specify Upload Body
Upload Body: set the Result Attribute of the XMLTemplater
Content Type: text/xml 
Base64 encoding the document content is quite easy:
  1. AttributeFileReader transformer: set the Source File Character Encoding to Binary (FME-binary)
  2. BinaryEncoder transformer: Encoding Type: Base64
If you upload large number of documents, it is worth to delete file content, base64 encoded and SOAP request attributes after HTTPCaller to avoid eating up memory.
I created a CSV with the document source file path, SharePoint URL and optionally metadata columns. that drive the upload.
In theory the workspace is straightforward: CSV reader -> AttributeFileReader -> BinaryEncoder -> XMLTemplater -> HTTPCaller -> AttributeRemover -> Inspector
There are a few possible problems: missing or unreadable source file, large file, invalid characters in filename (SharePoint is more restrictive than tile system), ...
Another thing, proper attribute display names and types:
Send a request to

Request URL: protocol://yoursharepointserver/_vti_bin/lists.asmx


Headers:
Name: SOAPAction
Value: http://schemas.microsoft.com/sharepoint/soap/GetList

 


Body:
<Envelope xmlns="http://schemas.xmlsoap.org/soap/envelope/">
    <Body>
        <GetList xmlns="http://schemas.microsoft.com/sharepoint/soap/">
            <listName>$(SP_LIST_NAME)</listName>
        </GetList>
    </Body>
</Envelope>

The XML response will contain Type, DisplayName, .. of all attributes.
My workspaces are a bit specific and big but if there is any interest I can get rid of the unnecessary bits and upload.
Hope it helps.
@revesz fantastic posting, thankyou very much.  Works perfectly. Deserves it's own question...

 

 

I had most of it after discovering the wsdl endpoint, however i didn't have the SOAPAction header.

 

 

I do have a small bug where if i try to write a feature  to a logger that includes my endocded doc, fme just stalls. remove to loggers, and it runs smoothly, very odd. I'm working in fme 2015.0, as i'm looking to deploy to that version of server, I suspect it'll be a bug that's fixed in 2016.

 

 

many thanks.

 

 

Nick

 

Badge +14

Just dropping an additional contribution to this thread should anyone find it. We're expecting a Sharepoint 'package' to land in the next few weeks from the good people at Safe that will include the ability to access formats supported by FME directly on Sharepoint via a web connection as well as a dedicated SharepointConnector Transformer. I believe there will be many very happy people 😀 .

Reply