View Full Version : Extract portion of web page to excel using VBA



swaggerbox
09-30-2013, 12:58 AM
In column A, I have a list of patent numbers. What I would like to do is extract the claims section of each patent from "https://www.google.com/patents/(+patent number)" to column B and count the number of characters in column C.

For example, in row 1, the value is CN103075166A. The link to the source page is therefore https://www.google.com/patents/CN103075166A.
I would like to extract only claim 1: "1 A composite self-adhesive polymer waterproof board, comprising: a top-down order by the isolation layer (1), self-adhesive layer (2), the waterproof board layer (3), geotextile composite layer ( 4) and the lower waterproof sheet layer (5) composite; waterproof plate of the upper layer (3) and the lower waterproof sheet layer (5) are composed of the following parts by weight ratio of raw material made of: a styrene - ethylene - D diene - I ~ 10 parts styrene copolymer; ethylene - vinyl acetate copolymer 30 to 90 parts; linear low density polyethylene of 20 to 50 parts; high density polyethylene 1 to 50 parts; 1 nm to 10 parts of silica; Nano-Alumina 1 ~ 10 parts; 1 to 10 parts of titanium oxide nano; 0.01 to 3 parts of an ultraviolet absorber; antioxidant 0.5 to 1.5 parts of secondary antioxidant 0.1 to 3 parts." This text should be pasted into column B (it is OK if some of it is truncated by Excel's cell limit).

The total word count for this claim is 145, which should go in column C. Does anyone have ideas on how I can get started?


Column A
CN103075166A
CN103075828A
CN103075938A
CN103076094A
CN103076683A
CN103077656A
CN103077775A
CN103077796A
CN103077869A
CN103077923A
CN103078065A
CN103078076A
CN103078093A
CN103078100A
CN103078123A
CN103078269A
CN103079328A
CN103079369A
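
One possible starting point is sketched below. It is only a rough outline under several assumptions, not a tested solution: it assumes the page can be fetched with a plain GET request through MSXML2.XMLHTTP, and that each claim sits in a <div> whose class list contains "claim-text" (a guess about the page markup that needs checking against the real page source and may change). The names FillFirstClaims, PatentUrl, FetchHtml, ExtractFirstClaim and CountWords are made up for this illustration.

```vba
Option Explicit

' ASSUMPTION: claim text sits in <div> elements whose class list includes this name.
Private Const CLAIM_CLASS As String = "claim-text"
' Excel's limit on the number of characters a single cell can hold.
Private Const MAX_CELL_CHARS As Long = 32767

' Walks the patent numbers in column A (starting at row 1), writes the first
' claim to column B and its word count to column C.
Public Sub FillFirstClaims()
    Dim ws As Worksheet, lastRow As Long, r As Long
    Dim patentNumber As String, claimText As String

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    For r = 1 To lastRow
        patentNumber = Trim$(CStr(ws.Cells(r, "A").Value))
        If Len(patentNumber) > 0 Then
            claimText = ExtractFirstClaim(FetchHtml(PatentUrl(patentNumber)))
            ws.Cells(r, "B").Value = Left$(claimText, MAX_CELL_CHARS)
            ws.Cells(r, "C").Value = CountWords(claimText)
        End If
    Next r
End Sub

' Builds the source page address from a patent number.
Private Function PatentUrl(ByVal patentNumber As String) As String
    PatentUrl = "https://www.google.com/patents/" & patentNumber
End Function

' Returns the page HTML, or an empty string if the status is not 200.
Private Function FetchHtml(ByVal url As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", url, False      ' False = synchronous request
    http.send
    If http.Status = 200 Then FetchHtml = http.responseText
End Function

' Returns the text of the first <div> carrying CLAIM_CLASS, or "" if none is found.
Private Function ExtractFirstClaim(ByVal html As String) As String
    Dim doc As Object, el As Object
    If Len(html) = 0 Then Exit Function

    Set doc = CreateObject("htmlfile")   ' late-bound HTML document
    doc.body.innerHTML = html

    For Each el In doc.getElementsByTagName("div")
        If InStr(1, " " & el.className & " ", " " & CLAIM_CLASS & " ") > 0 Then
            ExtractFirstClaim = Trim$(el.innerText)
            Exit Function
        End If
    Next el
End Function

' Counts whitespace-separated words.
Private Function CountWords(ByVal text As String) As Long
    Dim s As String
    s = Replace(Replace(Replace(text, vbCr, " "), vbLf, " "), vbTab, " ")
    Do While InStr(s, "  ") > 0          ' collapse runs of spaces
        s = Replace(s, "  ", " ")
    Loop
    s = Trim$(s)
    If Len(s) > 0 Then CountWords = UBound(Split(s, " ")) + 1
End Function
```

To try it, paste the code into a standard module, select the sheet with the patent numbers, and run FillFirstClaims. If a character count is wanted in column C instead of a word count, CountWords(claimText) can be swapped for Len(claimText). A row whose page does not load or does not contain the assumed markup is left with an empty column B and a 0 in column C.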