I am trying to consume an RSS feed using Puppeteer and output a new feed, but so far the value of every element in the new feed is "undefined". At first I thought this was because the elements have no attributes, but the result is still undefined with querySelectorAll, which should be grabbing the content of every item element in the feed. This is how I am querying the original feed:
await page.goto(url, {waitUntil: 'networkidle2'});
let rssitems = await page.evaluate(() => {
    let results;
    let items = document.querySelectorAll('item');
    items.forEach((item) => {
        results += '<title>' + item.querySelector('title').innerText + '</title>';
        results += '<description>' + item.querySelector('description').innerText + '</description>';
        results += '<link>' + item.querySelector('link').innerText + '</link>';
        results += '<guid>' + item.querySelector('guid').innerText + '</guid>';
        results += '<pubDate>' + item.querySelector('pubDate').innerText + '</pubDate>';
    });
    return results;
});
If I understand correctly, you are trying to interact with an RSS document. RSS is XML, not HTML, so you need the API of the Node classes rather than the HTMLElement classes. So instead of HTMLElement.innerText, you can try Node.textContent.
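As a rough illustration of that suggestion, here is a minimal sketch of the query rewritten around textContent. The helper name collectItems and its doc parameter are hypothetical, introduced only so the function can also be called outside the browser; it assumes the page Puppeteer loads exposes the feed's item elements to querySelectorAll, as the question expects. It also starts results as an empty string, because a variable declared with a bare "let results;" is undefined, and += would then put the literal text "undefined" at the start of the output.

```javascript
// Hypothetical helper (not from the original post): builds the new feed body
// from every <item>. Under page.evaluate it is called with no arguments, so
// `doc` falls back to the browser's global `document`.
function collectItems(doc = document) {
  const tags = ['title', 'description', 'link', 'guid', 'pubDate'];
  let results = ''; // empty string, so += never concatenates onto undefined
  doc.querySelectorAll('item').forEach((item) => {
    for (const tag of tags) {
      const el = item.querySelector(tag);
      // textContent is defined on Node, so it also works for XML elements;
      // a missing child element yields an empty string instead of throwing.
      const text = el ? el.textContent : '';
      results += `<${tag}>${text}</${tag}>`;
    }
  });
  return results;
}
```

With Puppeteer this could be used as "let rssitems = await page.evaluate(collectItems);" after the page.goto call. Note that textContent returns unescaped text, so characters such as & and < would still need escaping before the result is written into a new feed.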