且构网

Problem reading RSS with C# and .NET 3.5

Updated: 2023-11-27 19:33:58

I have been attempting to write some routines to read RSS and ATOM feeds using the new routines available in System.ServiceModel.Syndication, but unfortunately the Rss20FeedFormatter bombs out on about half the feeds I try with the following exception:

An error was encountered when parsing a DateTime value in the XML.

This seems to occur whenever the RSS feed expresses the publish date in the following format:

Thu, 16 Oct 08 14:23:26 -0700

If the feed expresses the publish date as GMT, things go fine:

Thu, 16 Oct 08 21:23:26 GMT

If there's some way to work around this with XmlReaderSettings, I have not found it. Can anyone assist?

RSS 2.0 formatted syndication feeds use the RFC 822 date-time specification when serializing elements like pubDate and lastBuildDate. Unfortunately, the RFC 822 date-time specification allows a very 'flexible' syntax for expressing the time-zone component of a date-time.

Time zone may be indicated in several ways. "UT" is Universal Time (formerly called "Greenwich Mean Time"); "GMT" is permitted as a reference to Universal Time. The military standard uses a single character for each zone. "Z" is Universal Time. "A" indicates one hour earlier, and "M" indicates 12 hours earlier; "N" is one hour later, and "Y" is 12 hours later. The letter "J" is not used. The other remaining two forms are taken from ANSI standard X3.51-1975. One allows explicit indication of the amount of offset from UT; the other uses common 3-character strings for indicating time zones in North America.

I believe the issue involves how the zone component of the RFC 822 date-time value is being processed. The feed formatter does not appear to handle date-times that use a local differential (a numeric offset such as -0700) to indicate the time zone.

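As RFC 1123 extends the RFC 822 specification, you could try using DateTimeFormatInfo.RFC1123Pattern (the "r" format specifier) to convert problematic date-times, or write your own parsing code for RFC 822 formatted dates. Bear in mind that the "r" pattern expects a four-digit year and the literal "GMT", so values with a two-digit year or a numeric offset will most likely still need custom parsing. Another option would be to use a third party framework instead of the System.ServiceModel.Syndication namespace classes.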
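
To illustrate the "write your own parsing code" option, here is a minimal sketch of an RFC 822 date parser. The names Rfc822DateTime, TryParse and TryParseZone are hypothetical and not part of System.ServiceModel.Syndication. The sketch assumes the zone is the last whitespace-separated token, and it only recognizes UT, GMT, Z, the North American three-letter zones, and numeric offsets of the form +hhmm or -hhmm. Military single-letter zones other than Z are rejected.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not a framework class.
public static class Rfc822DateTime
{
    // Named zones from RFC 822, mapped to their offset from UT in hours.
    private static readonly Dictionary<string, int> NamedZoneHours =
        new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
        {
            { "UT", 0 }, { "GMT", 0 }, { "Z", 0 },
            { "EST", -5 }, { "EDT", -4 }, { "CST", -6 }, { "CDT", -5 },
            { "MST", -7 }, { "MDT", -6 }, { "PST", -8 }, { "PDT", -7 }
        };

    // Date and time without the zone: two- or four-digit year, optional seconds.
    private static readonly string[] DateTimeFormats =
    {
        "d MMM yyyy H:mm:ss", "d MMM yy H:mm:ss",
        "d MMM yyyy H:mm", "d MMM yy H:mm"
    };

    public static bool TryParse(string value, out DateTimeOffset result)
    {
        result = default(DateTimeOffset);
        if (string.IsNullOrEmpty(value)) return false;

        string text = value.Trim();

        // Drop the optional day-of-week prefix, e.g. "Thu, ".
        int comma = text.IndexOf(',');
        if (comma >= 0) text = text.Substring(comma + 1).Trim();

        // Assumption: the zone is the last whitespace-separated token.
        int lastSpace = text.LastIndexOf(' ');
        if (lastSpace < 0) return false;
        string zone = text.Substring(lastSpace + 1);
        string dateTimePart = text.Substring(0, lastSpace).Trim();

        TimeSpan offset;
        if (!TryParseZone(zone, out offset)) return false;

        DateTime local;
        if (!DateTime.TryParseExact(dateTimePart, DateTimeFormats,
                CultureInfo.InvariantCulture, DateTimeStyles.AllowWhiteSpaces,
                out local))
        {
            return false;
        }

        result = new DateTimeOffset(local, offset);
        return true;
    }

    private static bool TryParseZone(string zone, out TimeSpan offset)
    {
        offset = TimeSpan.Zero;

        int hours;
        if (NamedZoneHours.TryGetValue(zone, out hours))
        {
            offset = TimeSpan.FromHours(hours);
            return true;
        }

        // Numeric local differential: a sign followed by four digits (hhmm).
        if (zone.Length != 5 || (zone[0] != '+' && zone[0] != '-')) return false;

        int hhmm;
        if (!int.TryParse(zone.Substring(1), NumberStyles.None,
                CultureInfo.InvariantCulture, out hhmm))
        {
            return false;
        }

        int minutes = hhmm % 100;
        if (minutes > 59) return false;

        // DateTimeOffset only accepts offsets within 14 hours of UTC.
        TimeSpan magnitude = new TimeSpan(hhmm / 100, minutes, 0);
        if (magnitude > TimeSpan.FromHours(14)) return false;

        offset = zone[0] == '-' ? magnitude.Negate() : magnitude;
        return true;
    }
}
```

With this sketch, the two sample values from the question, "Thu, 16 Oct 08 14:23:26 -0700" and "Thu, 16 Oct 08 21:23:26 GMT", should resolve to the same UTC instant. The two-digit year is interpreted according to the invariant culture's calendar settings, and getting the feed formatter itself to use such a parser is a separate problem that this sketch does not address.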

It appears there are some known issues with date-time parsing and the Rss20FeedFormatter that are in the process of being addressed by Microsoft.