How to Convert Word Document to HTML with C#/VB.NET

HTML Introduction

HTML (Hypertext Markup Language) is a markup language for describing web documents. It uses tags to mark each part of the content that will be displayed on a web page, so an HTML file can be treated as a web page file.

A web page file is in fact a plain-text file. Authors add tags to tell the browser how to display the content: for example, where images are placed and which style the text has. The same tag may be interpreted differently by different browsers, however, so a page can look different from one browser to another.

HTML is called a hypertext markup language because the text can contain hyperlinks. What is a hyperlink? It is a pointer to a URL. By clicking it, people open a new web page in the browser. That is one of the most important reasons HTML is so widely used.

Convert Word to HTML

How do you get an HTML file? You can write one by hand in any editor that produces plain-text source files. The other frequently used method is to convert other files to HTML. In this post, I will share a method for converting Word to HTML with C#/VB.NET. With this method, it takes just 3 steps to get a converted HTML file.

Also, I use a .NET Word component, Spire.Doc for .NET, in this example, so I add its dll file as a reference in my project.

Detailed Steps Are Shown as Follows:

I. Load the Word document that I want to convert to HTML from my computer with the document.LoadFromFile() method. The document I am going to convert is a .docx document.

C#

            // Requires: using Spire.Doc;
            Document document = new Document();
            document.LoadFromFile(@"E:\work\Antarctic.docx");

VB.NET

            ' Requires: Imports Spire.Doc
            Dim document As New Document()
            document.LoadFromFile("E:\work\Antarctic.docx")

II. Convert the Word document to HTML with the document.SaveToFile() method. Pass it the output file name as a string and the file format, which should be set to FileFormat.Html.

C#

            document.SaveToFile("ToHTML.html", FileFormat.Html);

VB.NET

            document.SaveToFile("ToHTML.html", FileFormat.Html)

III. Launch the converted file and run the project. Then, I get the HTML file converted from Word.

C#

            System.Diagnostics.Process.Start("ToHTML.html");

VB.NET

            System.Diagnostics.Process.Start("ToHTML.html")

Note: this method can also convert a DOC document to HTML.
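
Put together, the three steps fit in one small console program. The following is only a minimal sketch under a few assumptions: the Spire.Doc namespace, the WordToHtmlDemo namespace and the Program class are names chosen for illustration and do not appear in the snippets above, and the input path is the same example path used in step I. The launch call sets UseShellExecute explicitly because on .NET Core and later it defaults to false, in which case starting a bare .html file name fails; on .NET Framework the default is already true.

```csharp
using System.Diagnostics;
using Spire.Doc; // assumed namespace of Spire.Doc for .NET

namespace WordToHtmlDemo // hypothetical name, for illustration only
{
    class Program
    {
        static void Main(string[] args)
        {
            // Step I: load the Word document (same example path as above)
            Document document = new Document();
            document.LoadFromFile(@"E:\work\Antarctic.docx");

            // Step II: save it as HTML next to the executable
            document.SaveToFile("ToHTML.html", FileFormat.Html);

            // Step III: open the result with the application registered for .html files
            Process.Start(new ProcessStartInfo("ToHTML.html") { UseShellExecute = true });
        }
    }
}
```

To try it, change the input path to a Word file that exists on your machine and make sure the project references the Spire.Doc dll.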

Result Shown as Follows: [screenshot of the converted HTML file]

Freely Download Spire.Doc for .NET
