Generating Excel and Word Documents Efficiently Using Open XML and Helper Libraries

 
Using Open Xml and Helper Libraries
to generate Excel and Word
documents for websites
 
By Peter Messenger
Senior Developer
Triple Point Technologies (previously QMASTOR)
 
Website: 
www.petermessenger.com
 Email : stonecourier@gmail.com
Sole IT Developer : 
http://www.physiotherapyexercises.com
 
Why Generate Excel/Word
documents?
 
Portable
Commonly Used
Users are quite familiar with using it
Can be read by many other programs
 
Used at work and for my home project to
output ResX files for translation by
professional or other language speakers
 
How to generate excel files?
 
Idea 1 – used by student who developed our
translation program
 
COM Automation
“Ok” for single user desktop app
Not Thread Safe
Excel.exe not written to support concurrent users
Microsoft excel needs to be installed on the server
Very, very slow for large datasets
Not recommended by Microsoft
 
 
 
How to generate excel files?
 
Idea 2 – used in my output page on my website (2005)
 
Generate CSV or tab files and import
No compression, files can be quite large
 
Import/Export using Open OLE and Sql
Can be quite temperamental due to the way it determines the column types –
based on looking at the first “X” rows
Mixed column types (integers and text) can result in data being nulled
Varying length columns can be truncated due to it determining column types text
(255 characters max) versus long text columns types
 
using (OleDbConnection conn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\temp\\test.xls;Extended
Properties='Excel 8.0;HDR=Yes'"))
 {
 
 conn.Open();
 
 OleDbCommand cmd = new OleDbCommand("CREATE TABLE [Sheet1] ([Column1] string, [Column2] string)", conn);
 
cmd.ExecuteNonQuery();
}
 
Problems?
 
Both methods are problematic
The users can “break” the files by adding
columns, changing names or using sort badly
This results in data corruption and lots of
heart ache when uploading them
 
So it there a better way?
 
A Better Way?
 
Office 2007 introduced the Open Xml Format
Zipped xml format
Developer website 
http://openxmldeveloper.org/
Can download Open XML Productivity Tool
This can read any office file, display the xml and
generate C# script to regenerate the file
Based on this you can use files for templates and add
data or generate files from scratch
 
It all sounds good…….
 
Problems Still?
 
Documentation on how to do things isn’t very
good
Can be quite difficult, a single change can
result in the file refusing to open with very
cryptic error messages
Reading in large recordsets can be slow
 
So what then?
 
A solution?
 
Open source codeplex library
http://closedxml.codeplex.com/
http://epplus.codeplex.com/
 
I used Closed Xml as the documentation was better and it was a more
active project.
 
It provides a more structured interface and generates the xml document
for you in the background.
 
A good example
 
http://closedxml.codeplex.com/wikipage?title=Showcase&referringTitle=
Documentation
 
Benefits
 
Much easier to develop, takes out the
complexity of having to know the xml
Not brittle, doesn’t break
Loading in large datasets is lightning fast
Can still use existing files as templates
Can generate protected workbooks so users
cannot change anything you do not want
them to change
 
Demo
 
Translation page download for translators
http://www.physiotherapyexercises.com/ExerciseData_Download.aspx
 
Download file
Can only edit their language column, cannot edit
english or key column
Can still filter and sort to aid them in translation
Cannot sort in way that would break data
relationships
We then upload file, it determines changes made and
generates sql scripts to update what has changed
 
Demo 2
 
Main data for english files
http://www.physiotherapyexercises.com/ExerciseData_English.aspx
Download file
Can add, delete or modify records
Data can have validation built in, with comments
indicating what is acceptable
We then upload file, it determines changes made and
generates sql scripts to update what has changed
 
What about Word?
 
Have functionality for users to generate web,
PDF, or word documents that they can
customise. This uses the iTextSharp library.
Some users want absolute flexibility, being
able to provide their own word templates
 
Want another cheap (free) solution, something
on codeplex again?
 
Word Document Generator
 
Another open source codeplex project
http://worddocgenerator.codeplex.com/
Uses document placeholders
Had to amend the source code a bit so it could
handle custom images in the template file
 
Example for a client booklet
http://www.physiotherapyexercises.com/
 
 
Example Code
 
 DocumentGenerationInfo 
generationInfo = GetDocumentGenerationInfo("ExerciseDocumentGenerator", "1.0", datasource, templatefile, false);
 
var
 sampleDocumentGenerator = new ExerciseDocumentGenerator(generationInfo); 
-- ExerciseDocumentGenerator inherits off DocumentGenerator
byte[] 
result = sampleDocumentGenerator.GenerateDocument();
 
Response.ClearContent();
Response.Clear();
Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
Response.AddHeader("Content-Disposition", "attachment; filename=Exercises.docx;");
Response.BinaryWrite(result);
Response.Flush();
Response.End();
 
 private static DocumentGenerationInfo GetDocumentGenerationInfo(string docType, string docVersion, object dataContext, string fileName, bool
useDataBoundControls)
    {
        return new DocumentGenerationInfo
                                                    {
                                                        Metadata =
                                                            new DocumentMetadata { DocumentType = docType, DocumentVersion = docVersion },
                                                        DataContext = dataContext,
                                                        TemplateData =
                                                            File.ReadAllBytes(
                                                               HostingEnvironment.MapPath("~/" + fileName)),
                                                        IsDataBoundControls = useDataBoundControls
                                                    };
 
    }
 
 
Summary
 
Easy to integrate Word and Excel into your
websites with open xml and these libraries
Other libraries are also worth investigating
 
Comments and questions?
Slide Note
Embed
Share

"Learn about the advantages of generating Excel and Word documents, explore different methods like COM Automation and Open XML, and discover a better way to create templates and manipulate data effectively."


Uploaded on Apr 07, 2024 | 8 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. Download presentation by click this link. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

E N D

Presentation Transcript


  1. Using Open Xml and Helper Libraries to generate Excel and Word documents for websites By Peter Messenger Senior Developer Triple Point Technologies (previously QMASTOR) Website: www.petermessenger.com Email : stonecourier@gmail.com Sole IT Developer : http://www.physiotherapyexercises.com

  2. Why Generate Excel/Word documents? Portable Commonly Used Users are quite familiar with using it Can be read by many other programs Used at work and for my home project to output ResX files for translation by professional or other language speakers

  3. How to generate excel files? Idea 1 used by student who developed our translation program COM Automation Ok for single user desktop app Not Thread Safe Excel.exe not written to support concurrent users Microsoft excel needs to be installed on the server Very, very slow for large datasets Not recommended by Microsoft

  4. How to generate excel files? Idea 2 used in my output page on my website (2005) Generate CSV or tab files and import No compression, files can be quite large Import/Export using Open OLE and Sql Can be quite temperamental due to the way it determines the column types based on looking at the first X rows Mixed column types (integers and text) can result in data being nulled Varying length columns can be truncated due to it determining column types text (255 characters max) versus long text columns types using (OleDbConnectionconn = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\\temp\\test.xls;Extended Properties='Excel 8.0;HDR=Yes'")) { conn.Open(); OleDbCommandcmd = new OleDbCommand("CREATE TABLE [Sheet1] ([Column1] string, [Column2] string)", conn); cmd.ExecuteNonQuery(); }

  5. Problems? Both methods are problematic The users can break the files by adding columns, changing names or using sort badly This results in data corruption and lots of heart ache when uploading them So it there a better way?

  6. A Better Way? Office 2007 introduced the Open Xml Format Zipped xml format Developer website http://openxmldeveloper.org/ Can download Open XML Productivity Tool This can read any office file, display the xml and generate C# script to regenerate the file Based on this you can use files for templates and add data or generate files from scratch It all sounds good .

  7. Problems Still? Documentation on how to do things isn t very good Can be quite difficult, a single change can result in the file refusing to open with very cryptic error messages Reading in large recordsets can be slow So what then?

  8. A solution? Open source codeplex library http://closedxml.codeplex.com/ http://epplus.codeplex.com/ I used Closed Xml as the documentation was better and it was a more active project. It provides a more structured interface and generates the xml document for you in the background. A good example http://closedxml.codeplex.com/wikipage?title=Showcase&referringTitle= Documentation

  9. Benefits Much easier to develop, takes out the complexity of having to know the xml Not brittle, doesn t break Loading in large datasets is lightning fast Can still use existing files as templates Can generate protected workbooks so users cannot change anything you do not want them to change

  10. Demo Translation page download for translators http://www.physiotherapyexercises.com/ExerciseData_Download.aspx Download file Can only edit their language column, cannot edit english or key column Can still filter and sort to aid them in translation Cannot sort in way that would break data relationships We then upload file, it determines changes made and generates sql scripts to update what has changed

  11. Demo 2 Main data for english files http://www.physiotherapyexercises.com/ExerciseData_English.aspx Download file Can add, delete or modify records Data can have validation built in, with comments indicating what is acceptable We then upload file, it determines changes made and generates sql scripts to update what has changed

  12. What about Word? Have functionality for users to generate web, PDF, or word documents that they can customise. This uses the iTextSharp library. Some users want absolute flexibility, being able to provide their own word templates Want another cheap (free) solution, something on codeplex again?

  13. Word Document Generator Another open source codeplex project http://worddocgenerator.codeplex.com/ Uses document placeholders Had to amend the source code a bit so it could handle custom images in the template file Example for a client booklet http://www.physiotherapyexercises.com/

  14. Example Code DocumentGenerationInfo generationInfo = GetDocumentGenerationInfo("ExerciseDocumentGenerator", "1.0", datasource, templatefile, false); var sampleDocumentGenerator = new ExerciseDocumentGenerator(generationInfo); -- ExerciseDocumentGenerator inherits off DocumentGenerator byte[] result = sampleDocumentGenerator.GenerateDocument(); Response.ClearContent(); Response.Clear(); Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"; Response.AddHeader("Content-Disposition", "attachment; filename=Exercises.docx;"); Response.BinaryWrite(result); Response.Flush(); Response.End(); private static DocumentGenerationInfo GetDocumentGenerationInfo(string docType, string docVersion, object dataContext, string fileName, bool useDataBoundControls) { return new DocumentGenerationInfo { Metadata = new DocumentMetadata { DocumentType = docType, DocumentVersion = docVersion }, DataContext = dataContext, TemplateData = File.ReadAllBytes( HostingEnvironment.MapPath("~/" + fileName)), IsDataBoundControls = useDataBoundControls }; }

  15. Summary Easy to integrate Word and Excel into your websites with open xml and these libraries Other libraries are also worth investigating Comments and questions?

Related


More Related Content

giItT1WQy@!-/#giItT1WQy@!-/#