Although I’m not a beginner, I’m having difficulty cleanly scraping tables that don’t have targeted classes or IDs. Also, the data is in weird spots and sometimes doesn’t even exist.
This is my current project: http://a810-bisweb.nyc.gov/bisweb/OverviewForComplaintServlet?complaintno=3732304
and I need to output it as an array of key/value pairs.
Just using the browser's console:
let tables = document.querySelectorAll('table') // get all tables
tables.forEach(table => table.querySelectorAll('tr').forEach(tr => console.log(tr.innerText))) // log each row's text, one table at a time
I get this ugly text that I have to parse through, and there are so many variables to consider that the file is turning into a 500-line monster.
Any ideas for a better way to do this?
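One direction that might keep this manageable is to work cell by cell instead of parsing each row's full innerText. The sketch below is only a guess at the page layout: it assumes a label cell ends with a colon and its value sits in the next cell of the same row. The name rowsToPairs is hypothetical, and the sample values in the usage note are made up.

```javascript
// Hypothetical helper: turn rows of cell text into [key, value] pairs.
// ASSUMPTION: a label cell ends with ":" and its value is the next cell
// in the same row. Adjust the rule if the real page is laid out differently.
function rowsToPairs(rows) {
  const pairs = [];
  for (const cells of rows) {
    // Collapse whitespace runs; in JavaScript, \s also matches the
    // non-breaking space (U+00A0) that &nbsp; produces.
    const clean = cells.map(c => c.replace(/\s+/g, ' ').trim());
    for (let i = 0; i < clean.length; i++) {
      if (!clean[i].endsWith(':')) continue;
      const key = clean[i].slice(0, -1).trim();
      const next = clean[i + 1];
      // A missing cell, an empty cell, or another label means "no value".
      const missing = next === undefined || next === '' || next.endsWith(':');
      pairs.push([key, missing ? null : next]);
      if (!missing) i++; // skip the value cell that was just consumed
    }
  }
  return pairs;
}
```

In the console, the rows could be collected with something like [...table.rows].map(tr => [...tr.cells].map(td => td.innerText)) and passed in. For example, rowsToPairs([['BIN:', '3000000'], ['Comments:']]) gives [['BIN', '3000000'], ['Comments', null]], so a missing value shows up as null instead of breaking the parse.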
EDIT: Additional question
This is particularly hard to parse. How do you get just the first number here? Splitting with split('&nbsp;') does not work too well.
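A possible reason the split misbehaves: innerText returns the rendered text, so an &nbsp; entity arrives as the non-breaking space character U+00A0, not as the literal string '&nbsp;'. A minimal sketch that sidesteps the separator entirely and grabs the first run of digits (firstNumber is a hypothetical name, and it assumes the number is plain digits with no commas or decimals):

```javascript
// Hypothetical helper: return the first run of digits in a string, or null.
// ASSUMPTION: the wanted number is plain digits (no commas, signs, decimals).
function firstNumber(text) {
  const match = text.match(/\d+/);
  return match ? match[0] : null;
}
```

If the separator itself matters, splitting on /\s+/ or on '\u00a0' should behave better than splitting on the entity name, since \s in JavaScript also matches U+00A0.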