Generating Excel and Word Documents Efficiently Using Open XML and Helper Libraries
"Learn about the advantages of generating Excel and Word documents, explore different methods like COM Automation and Open XML, and discover a better way to create templates and manipulate data effectively."
Download Presentation
Please find below an Image/Link to download the presentation.
The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.
E N D
Presentation Transcript
Using Open Xml and Helper Libraries to generate Excel and Word documents for websites By Peter Messenger Senior Developer Triple Point Technologies (previously QMASTOR) Website: www.petermessenger.com Email : stonecourier@gmail.com Sole IT Developer : http://www.physiotherapyexercises.com
Why Generate Excel/Word documents? Portable Commonly Used Users are quite familiar with using it Can be read by many other programs Used at work and for my home project to output ResX files for translation by professional or other language speakers
How to generate excel files? Idea 1 used by student who developed our translation program COM Automation Ok for single user desktop app Not Thread Safe Excel.exe not written to support concurrent users Microsoft excel needs to be installed on the server Very, very slow for large datasets Not recommended by Microsoft
How to generate excel files? Idea 2 used in my output page on my website (2005) Generate CSV or tab files and import No compression, files can be quite large Import/Export using Open OLE and Sql Can be quite temperamental due to the way it determines the column types based on looking at the first X rows Mixed column types (integers and text) can result in data being nulled Varying length columns can be truncated due to it determining column types text (255 characters max) versus long text columns types using (OleDbConnectionconn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\temp\\test.xls;Extended Properties='Excel 8.0;HDR=Yes'")) { conn.Open(); OleDbCommandcmd = new OleDbCommand("CREATE TABLE [Sheet1] ([Column1] string, [Column2] string)", conn); cmd.ExecuteNonQuery(); }
Problems? Both methods are problematic The users can break the files by adding columns, changing names or using sort badly This results in data corruption and lots of heart ache when uploading them So it there a better way?
A Better Way? Office 2007 introduced the Open Xml Format Zipped xml format Developer website http://openxmldeveloper.org/ Can download Open XML Productivity Tool This can read any office file, display the xml and generate C# script to regenerate the file Based on this you can use files for templates and add data or generate files from scratch It all sounds good .
Problems Still? Documentation on how to do things isn t very good Can be quite difficult, a single change can result in the file refusing to open with very cryptic error messages Reading in large recordsets can be slow So what then?
A solution? Open source codeplex library http://closedxml.codeplex.com/ http://epplus.codeplex.com/ I used Closed Xml as the documentation was better and it was a more active project. It provides a more structured interface and generates the xml document for you in the background. A good example http://closedxml.codeplex.com/wikipage?title=Showcase&referringTitle= Documentation
Benefits Much easier to develop, takes out the complexity of having to know the xml Not brittle, doesn t break Loading in large datasets is lightning fast Can still use existing files as templates Can generate protected workbooks so users cannot change anything you do not want them to change
Demo Translation page download for translators http://www.physiotherapyexercises.com/ExerciseData_Download.aspx Download file Can only edit their language column, cannot edit english or key column Can still filter and sort to aid them in translation Cannot sort in way that would break data relationships We then upload file, it determines changes made and generates sql scripts to update what has changed
Demo 2 Main data for english files http://www.physiotherapyexercises.com/ExerciseData_English.aspx Download file Can add, delete or modify records Data can have validation built in, with comments indicating what is acceptable We then upload file, it determines changes made and generates sql scripts to update what has changed
What about Word? Have functionality for users to generate web, PDF, or word documents that they can customise. This uses the iTextSharp library. Some users want absolute flexibility, being able to provide their own word templates Want another cheap (free) solution, something on codeplex again?
Word Document Generator Another open source codeplex project http://worddocgenerator.codeplex.com/ Uses document placeholders Had to amend the source code a bit so it could handle custom images in the template file Example for a client booklet http://www.physiotherapyexercises.com/
Example Code DocumentGenerationInfo generationInfo = GetDocumentGenerationInfo("ExerciseDocumentGenerator", "1.0", datasource, templatefile, false); var sampleDocumentGenerator = new ExerciseDocumentGenerator(generationInfo); -- ExerciseDocumentGenerator inherits off DocumentGenerator byte[] result = sampleDocumentGenerator.GenerateDocument(); Response.ClearContent(); Response.Clear(); Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"; Response.AddHeader("Content-Disposition", "attachment; filename=Exercises.docx;"); Response.BinaryWrite(result); Response.Flush(); Response.End(); private static DocumentGenerationInfo GetDocumentGenerationInfo(string docType, string docVersion, object dataContext, string fileName, bool useDataBoundControls) { return new DocumentGenerationInfo { Metadata = new DocumentMetadata { DocumentType = docType, DocumentVersion = docVersion }, DataContext = dataContext, TemplateData = File.ReadAllBytes( HostingEnvironment.MapPath("~/" + fileName)), IsDataBoundControls = useDataBoundControls }; }
Summary Easy to integrate Word and Excel into your websites with open xml and these libraries Other libraries are also worth investigating Comments and questions?