Asked by: rolling_codes Asked: 4/23/2023 Last edited by: rolling_codes Updated: 5/21/2023 Views: 186
Parse any date with timezone in javascript
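Q:
I have been looking for an existing solution but have not found one that handles this scenario: I want to be able to parse dates that may be malformed. The dates are scraped from web pages all over the world and, unfortunately, only about half of them are kind enough to provide a properly formatted ISO string.
For the following examples, is my only option to parse the input with regular expressions and then rebuild the date?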
// invalid date
new Date('Apr 21, 2023,06:51 pm EDT');
// invalid date
new Date('Apr 21, 2023 06:51pm EDT');
// invalid date
new Date('Apr 21, 2023 06:51 p.m EDT');
// invalid date
new Date('Apr 21, 2023 06.51 pm EDT');
// invalid date
new Date('Apr 21, 2023 06:51 pm ET');
// invalid date
new Date('06:51 pm Apr 21, 2023 EDT');
// invalid date
new Date('Apr 21st 2023 EDT');
// invalid date
new Date('7th Feb 2023');
// invalid date
new Date('3 hours ago');
// invalid date
new Date('10m ago');
// invalid date (haven't solved this one yet)
new Date('Thursday 10:30PM EST');
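My current solution is to first check the string for date digits plus an optional timezone, then for time digits plus an optional timezone, and finally rebuild the date string in ISO format. That seems like a lot of boilerplate for what appears to be a very common requirement. Is there a simpler way to do this?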
import ms from 'ms';
export const TIME_EXPR = /(\d\d?)\s*(a\.?m\.?|p\.?m\.?)|(\d\d?)[.:](\d\d?)(?:[.:](\d\d?))?(?:\s*(a\.?m\.?|p\.?m\.?))?(?:.*(?:\s*(ACDT|ACS?T|AES?T|AKDT|AKS?T|BS?T|CES?T|CDT|CS?T|EDT|ES?T|IS?T|JS?T|MDT|MSK|NZS?T|PDT|PS?T|UTC)))?/i;
export const DATE_EXPR = /(\d\d?\s*(?:h|h(?:ou)?rs?|m|min(?:ute)s?))\s*ago|(?:(\d\d?)([-./])(\d\d?)\3(\d{4}|\d{2})|(\d{4})([-./])(\d\d?)\7(\d\d?)|(?:(\d\d?)(?:st|nd|rd|th)?\s*)?(Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:t(?:ember)?)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)(?:\s+(\d\d?)(?:st|nd|rd|th)?)?(?:[,\s]\s*(\d{4})?))(?:.*(?:\s*(ACDT|ACS?T|AES?T|AKDT|AKS?T|BS?T|CES?T|CDT|CS?T|EDT|ES?T|IS?T|JS?T|MDT|MSK|NZS?T|PDT|PS?T|UTC)))?|(\d\d\d\d+)/i;
export function monthToString(month: number | string) {
  const m = parseInt(`${month}`);
  if (Number.isNaN(m)) {
    return month;
  }
  return new Date(`2023-${m}-07`).toLocaleString('en-US', { month: 'long' });
}
export function parseDate(context?: string) {
  if (!context) {
    return new Date('invalid');
  }
  // Fast path: the native parser already understands the input
  const date = new Date(context.trim());
  if (!Number.isNaN(date.valueOf()) && date.valueOf() > 0) {
    return date;
  }
  const dateMatches = context.match(DATE_EXPR);
  const timeMatches = context.match(TIME_EXPR);
  let year = String(new Date().getFullYear());
  let month = monthToString(new Date().getMonth() + 1);
  let day = String(new Date().getDate());
  let hour = 0, min = 0, sec = 0, amOrPm = '';
  let timezone = '';
  if (!dateMatches) {
    return new Date('invalid');
  }
  const [_0, relative, month1, _3, day1, year1, year2, _7, month2, day2, day3, month3, day4, year3, tz = '', timestamp] = dateMatches;
  if (relative) {
    // Normalize units for ms(): 'hours' -> 'h', 'minutes' -> 'm'
    return new Date(Date.now() - ms(relative.replace(/h(?:ou)?rs?/, 'h').replace(/m(?:in(?:ute)?)?s?/, 'm')));
  }
  const datetime = parseInt(timestamp);
  if (!Number.isNaN(datetime)) {
    // valueOf() must be called on the Date, not on the isNaN() result
    if (!Number.isNaN(new Date(datetime).valueOf())) {
      return new Date(datetime);
    }
  }
  timezone = tz;
  year = year1 ?? year2 ?? year3 ?? String(new Date().getFullYear());
  month = monthToString(month1 ?? month2 ?? month3 ?? new Date().getMonth() + 1);
  day = day1 ?? day2 ?? day3 ?? day4 ?? String(new Date().getDate());
  if (timeMatches) {
    const [_0, hour1, amOrPm1, hour2, min1, sec1, amOrPm2, timezone2] = timeMatches;
    hour = !Number.isNaN(parseInt(hour1 ?? hour2)) ? parseInt(hour1 ?? hour2) : 0;
    min = !Number.isNaN(parseInt(min1)) ? parseInt(min1) : 0;
    sec = !Number.isNaN(parseInt(sec1)) ? parseInt(sec1) : 0;
    amOrPm = (amOrPm1 ?? amOrPm2 ?? '').replace(/\./g, '');
    if (timezone2) {
      timezone = timezone2;
    }
  }
  // Rebuild a string the native parser accepts, e.g. "April 21, 2023 6:51:0 pm EST"
  const dateMatch = [`${month} ${day}, ${String(year).length === 2 ? `20${year}` : year} ${hour}:${min}:${sec} ${amOrPm ? amOrPm : (hour < 12) ? 'am' : 'pm'}`, timezone.replace(/^(A[CEK]|CE|NZ|[BCEIJP])T$/, ($0, $1) => `${$1}ST`)].join(' ');
  const parsedDate = new Date(dateMatch);
  return parsedDate;
}
export function sortDates(...dates: Date[]) {
  return [...dates].filter((d) => !Number.isNaN(d.valueOf())).sort((a, b) => {
    return a.valueOf() - b.valueOf();
  });
}
export function minDate(...dates: Date[]) {
  const sortedDates = sortDates(...dates);
  if (sortedDates.length === 0) {
    return new Date('invalid');
  }
  return sortedDates[0];
}
export function maxDate(...dates: Date[]) {
  const sortedDates = sortDates(...dates);
  if (sortedDates.length === 0) {
    return new Date('invalid');
  }
  return sortedDates[sortedDates.length - 1];
}
A:
-1 votes
Dimava
4/23/2023
#1
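Have you tried any-date-parser (https://www.npmjs.com/package/any-date-parser)?
It parses a wide variety of date formats, including human-entered dates.
any-date-parser has an addFormat() function for adding custom parsers.
Supported formats include 24-hour time, 12-hour time, timezone offsets, timezone abbreviations, and several orderings of year, month (numeric or by name), and day.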
Comments
0 votes
RobG
4/23/2023
This should be a comment, not an answer.
0 votes
trincot
4/23/2023
#2
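The best approach is to have dates formatted in a standard way at the source (the ECMAScript date time string format). If that is not possible, then you do indeed need to parse the input (or let a library do it for you).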
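To make "formatted in a standard way at the source" concrete, here is a minimal sketch of a round trip through the ECMAScript date time string format. It only uses built-in Date methods; the variable names are illustrative.

```javascript
// Producing side: toISOString() emits the ECMAScript date time string
// format, always in UTC (suffix "Z").
const original = new Date(Date.UTC(2023, 3, 21, 22, 51)); // Apr 21, 2023 22:51 UTC
const serialized = original.toISOString();
console.log(serialized); // 2023-04-21T22:51:00.000Z

// Consuming side: the Date constructor reads the string back without
// any custom parsing.
const restored = new Date(serialized);
console.log(restored.getTime() === original.getTime()); // true

// The same instant written with an explicit offset instead of "Z".
const withOffset = Date.parse("2023-04-21T18:51:00-04:00");
console.log(withOffset === original.getTime()); // true
```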
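"I have a super complex regular expression..."
Maybe you could build it up in steps so that it stays manageable? Alternatively, if you chose to have the regex perform numerical validation, you could remove that validation and leave it to the Date constructor.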
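As a rough illustration of building a pattern up in steps, the sketch below composes a time-of-day expression from two small named fragments. The names hourMinute, meridiem, timeExpr and parseTimeOfDay are hypothetical, and the pattern is meant to be run on the time portion of a string only; it is not a replacement for a full date parser.

```javascript
// Each fragment is small enough to read and test on its own.
const hourMinute = String.raw`(?<hour>\d\d?)[.:](?<minute>\d\d)`;
const meridiem = String.raw`(?<meridiem>[ap])\.?m\.?`;
const timeExpr = new RegExp(`${hourMinute}(?:\\s*${meridiem})?`, "i");

// Hypothetical helper: extracts a 24-hour { hour, minute } pair,
// or undefined when no time is found.
function parseTimeOfDay(text) {
  const groups = text.match(timeExpr)?.groups;
  if (!groups) return undefined;
  let hour = Number(groups.hour);
  if (groups.meridiem) {
    // 12 am -> 0, 12 pm -> 12, otherwise add 12 for pm
    hour = (hour % 12) + (groups.meridiem.toLowerCase() === "p" ? 12 : 0);
  }
  return { hour, minute: Number(groups.minute) };
}

console.log(parseTimeOfDay("06:51 pm"));  // { hour: 18, minute: 51 }
console.log(parseTimeOfDay("06.51 p.m")); // { hour: 18, minute: 51 }
console.log(parseTimeOfDay("19:56"));     // { hour: 19, minute: 56 }
```

No range checks are done here (an input such as "27:99" would still match), which is in line with the suggestion to leave validation to the Date constructor.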
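Here is a possible implementation with the following characteristics:
- It does not focus on validation; it generously matches out-of-range numbers.
- It allows any punctuation between components (anything that matches \W+).
- It uses a lookup object to map known timezone codes to timezone offsets.
- It produces a string that conforms to the ECMAScript date time string format, provided the input components are valid (within range).
- It leaves it to the caller to pass that date time string to the Date constructor or to Date.parse.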
const zones = {aoe:'-12:00',y:'-12:00',nut:'-11:00',sst:'-11:00',x:'-11:00',ckt:'-10:00',hst:'-10:00',taht:'-10:00',w:'-10:00',mart:'-09:30',akst:'-09:00',gamt:'-09:00',hdt:'-09:00',v:'-09:00',akdt:'-08:00',pst:'-08:00',pst:'-08:00',u:'-08:00',mst:'-07:00',pdt:'-07:00',t:'-07:00',cst:'-06:00',east:'-06:00',galt:'-06:00',mdt:'-06:00',s:'-06:00',act:'-05:00',cdt:'-05:00',cist:'-05:00',cot:'-05:00',cst:'-05:00',easst:'-05:00',ect:'-05:00',est:'-05:00',pet:'-05:00',r:'-05:00',amt:'-04:00',ast:'-04:00',bot:'-04:00',cdt:'-04:00',cidst:'-04:00',clt:'-04:00',edt:'-04:00',fkt:'-04:00',gyt:'-04:00',pyt:'-04:00',q:'-04:00',vet:'-04:00',nst:'-03:30',adt:'-03:00',amst:'-03:00',art:'-03:00',brt:'-03:00',clst:'-03:00',fkst:'-03:00',gft:'-03:00',p:'-03:00',pmst:'-03:00',pyst:'-03:00',rott:'-03:00',srt:'-03:00',uyt:'-03:00',warst:'-03:00',wgt:'-03:00',ndt:'-02:30',brst:'-02:00',fnt:'-02:00',gst:'-02:00',o:'-02:00',pmdt:'-02:00',uyst:'-02:00',wgst:'-02:00',azot:'-01:00',cvt:'-01:00',egt:'-01:00',n:'-01:00',
utc:'+00:00',azost:'+00:00',egst:'+00:00',gmt:'+00:00',wet:'+00:00',wt:'+00:00',z:'+00:00',a:'+01:00',bst:'+01:00',cet:'+01:00',ist:'+01:00',wat:'+01:00',west:'+01:00',wst:'+01:00',b:'+02:00',cat:'+02:00',cest:'+02:00',eet:'+02:00',ist:'+02:00',sast:'+02:00',wast:'+02:00',ast:'+03:00',c:'+03:00',eat:'+03:00',eest:'+03:00',fet:'+03:00',idt:'+03:00',msk:'+03:00',syot:'+03:00',trt:'+03:00',irst:'+03:30',adt:'+04:00',amt:'+04:00',azt:'+04:00',d:'+04:00',get:'+04:00',gst:'+04:00',kuyt:'+04:00',msd:'+04:00',mut:'+04:00',ret:'+04:00',samt:'+04:00',sct:'+04:00',aft:'+04:30',irdt:'+04:30',amst:'+05:00',aqtt:'+05:00',azst:'+05:00',e:'+05:00',mawt:'+05:00',mvt:'+05:00',orat:'+05:00',pkt:'+05:00',tft:'+05:00',tjt:'+05:00',tmt:'+05:00',uzt:'+05:00',yekt:'+05:00',ist:'+05:30',npt:'+05:45',
almt:'+06:00',bst:'+06:00',btt:'+06:00',f:'+06:00',iot:'+06:00',kgt:'+06:00',omst:'+06:00',qyzt:'+06:00',vost:'+06:00',yekst:'+06:00',cct:'+06:30',mmt:'+06:30',cxt:'+07:00',davt:'+07:00',g:'+07:00',hovt:'+07:00',ict:'+07:00',krat:'+07:00',novst:'+07:00',novt:'+07:00',omsst:'+07:00',wib:'+07:00',awst:'+08:00',bnt:'+08:00',cast:'+08:00',chot:'+08:00',cst:'+08:00',h:'+08:00',hkt:'+08:00',hovst:'+08:00',irkt:'+08:00',krast:'+08:00',myt:'+08:00',pht:'+08:00',sgt:'+08:00',ulat:'+08:00',wita:'+08:00',pyt:'+08:30',acwst:'+08:45',awdt:'+09:00',chost:'+09:00',i:'+09:00',irkst:'+09:00',jst:'+09:00',kst:'+09:00',pwt:'+09:00',tlt:'+09:00',ulast:'+09:00',wit:'+09:00',yakt:'+09:00',acst:'+09:30',aest:'+10:00',chut:'+10:00',chst:'+10:00',ddut:'+10:00',k:'+10:00',pgt:'+10:00',vlat:'+10:00',yakst:'+10:00',yapt:'+10:00',acdt:'+10:30',lhst:'+10:30',aedt:'+11:00',bst:'+11:00',kost:'+11:00',l:'+11:00',lhdt:'+11:00',magt:'+11:00',nct:'+11:00',nft:'+11:00',pont:'+11:00',sakt:'+11:00',sbt:'+11:00',sret:'+11:00',vlast:'+11:00',vut:'+11:00',anast:'+12:00',anat:'+12:00',fjt:'+12:00',gilt:'+12:00',m:'+12:00',magst:'+12:00',mht:'+12:00',nfdt:'+12:00',nrt:'+12:00',nzst:'+12:00',petst:'+12:00',pett:'+12:00',tvt:'+12:00',wakt:'+12:00',wft:'+12:00',chast:'+12:45',fjst:'+13:00',nzdt:'+13:00',phot:'+13:00',tkt:'+13:00',tot:'+13:00',wst:'+13:00',chadt:'+13:45',lint:'+14:00',tost:'+14:00'};
const monthRe = "(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)";
const dayRe = "(\\d\\d?)[stnrdh]{0,2}";
const dateRe = `(?:${monthRe}\\W+${dayRe}|${dayRe}\\W+${monthRe})\\W+(\\d{4})`;
const timeRe = "(\\d\\d?):(\\d\\d)(?:\\W*([ap])\\.?m)?(?:\\W*([a-z]{1,5}))?";
const regex = RegExp(`^\\W*${dateRe}(?:\\W+${timeRe})?\\W*$`, "i");
function toDateTimeStringFormat(s) {
  const match = s.toLowerCase().match(regex);
  if (!match) return;
  let [, m1, day, d2, m2, year, hour, minute, pm, zone] = match;
  const month = 1 + (monthRe.indexOf(m1 ?? m2) >> 2); // month name to number
  day ??= d2;
  zone = zones[zone] ?? ""; // timezone code to offset
  hour ??= "0";
  minute ??= "0";
  if (pm) hour = String((+hour % 12) + 12 * (pm == "p")); // to 24h range
  return `${year}-${month}-${day}T${hour}:${minute}${zone}`
    .replace(/(?<!\d)\d(?!\d)/g, "0$&"); // pad single digit numbers
}
const tests = [
  'Apr 21, 2023,06:51 pm EDT',
  'Apr 21, 2023 06:51pm EDT',
  'Apr 21, 2023 06:51 pm EDT',
  '7th Feb 2023',
  'Dec 6th, 2022 19:56 CET',
  'Jan 3rd, 2021; 0:04(IST)',
];
for (const test of tests) console.log(toDateTimeStringFormat(test));
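Since the function only returns a string, the caller still has to turn it into a Date and check the result. A minimal sketch of that last step follows; toValidDate is a hypothetical helper name, and the input strings are examples of the shape the function above produces.

```javascript
// Hypothetical helper: returns a Date, or null when the string is rejected.
function toValidDate(dateTimeString) {
  const time = Date.parse(dateTimeString ?? "");
  return Number.isNaN(time) ? null : new Date(time);
}

// A well-formed date time string with an offset.
const ok = toValidDate("2023-04-21T18:51-04:00");
console.log(ok.toISOString()); // 2023-04-21T22:51:00.000Z

// Month 13 would pass a lenient regex but is rejected at this stage.
console.log(toValidDate("2023-13-21T18:51-04:00")); // null

// No regex match at all: the function above returns undefined.
console.log(toValidDate(undefined)); // null
```

Note that a date time string without an offset, such as "2023-02-07T00:00", is interpreted as local time, so the resulting instant depends on the machine running the code.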
Of course, this is limited and will need to be extended if more input formats must be supported.
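One limitation worth illustrating: several abbreviations (cst, ist and bst, among others) appear more than once in the zones object, and in an object literal the last duplicate key silently wins. The sketch below shows that effect and one possible way around it. zoneCandidates and resolveZone are hypothetical names, and the offsets are the ones listed in the zones object.

```javascript
// Duplicate keys in an object literal: the last one silently wins.
const lastWins = { cst: "-06:00", cst: "-05:00", cst: "+08:00" };
console.log(lastWins.cst); // +08:00

// Hypothetical alternative: keep every candidate and let the caller choose.
const zoneCandidates = {
  cst: ["-06:00", "-05:00", "+08:00"],
  ist: ["+01:00", "+02:00", "+05:30"],
  bst: ["+01:00", "+06:00", "+11:00"],
  edt: ["-04:00"],
};

function resolveZone(code, preferred = []) {
  const candidates = zoneCandidates[code.toLowerCase()] ?? [];
  if (candidates.length <= 1) return candidates[0] ?? "";
  // Ambiguous code: only accept an offset the caller explicitly prefers.
  return candidates.find((offset) => preferred.includes(offset)) ?? "";
}

console.log(resolveZone("EDT"));             // -04:00
console.log(resolveZone("CST"));             // "" (ambiguous, no preference)
console.log(resolveZone("CST", ["-06:00"])); // -06:00
```

Returning an empty string for an unresolved code mirrors the fallback in the implementation above, where an unknown zone yields a string without an offset.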
Comments
0 votes
rolling_codes
4/24/2023
Yes, I went with a custom implementation. Not what I would have preferred, but it does what I need.