Implement Instant parsing in common module #106

Closed
wants to merge 3 commits into from
Changes from 1 commit
5 changes: 4 additions & 1 deletion core/common/src/Instant.kt
@@ -123,12 +123,15 @@ public expect class Instant : Comparable<Instant> {

/**
* Parses a string that represents an instant in ISO-8601 format including date and time components and
- * the mandatory `Z` designator of the UTC+0 time zone and returns the parsed [Instant] value.
+ * time zone offset.
*
* Examples of instants in ISO-8601 format:
+ * - `2020-08-30T18:43Z`
* - `2020-08-30T18:43:00Z`
* - `2020-08-30T18:43:00.500Z`
* - `2020-08-30T18:43:00.123456789Z`
+ * - `2020-08-30T18:43:00+01:00`
+ * - `2020-08-30T18:43:00+0100`
*
* @throws IllegalArgumentException if the text cannot be parsed or the boundaries of [Instant] are exceeded.
*/
141 changes: 141 additions & 0 deletions core/common/src/InstantParser.kt
@@ -0,0 +1,141 @@
/*
* Copyright 2019-2021 JetBrains s.r.o.
* Use of this source code is governed by the Apache 2.0 License that can be found in the LICENSE.txt file.
*/

package kotlinx.datetime

import kotlin.math.min
import kotlin.math.pow

internal fun parseInstantCommon(string: String): Instant = parseIsoString(string)
Author

The parseInstantCommon facade could try multiple formats before giving up, such as #83, if that gets implemented.

Collaborator

If the need for such extensibility arises, we have little enough code to be able to add it quickly.
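
To make the extensibility point concrete, here is a minimal sketch of a facade that tries several parsers in order. It is not code from this PR: `parseWithFallbacks` is a hypothetical name, and it assumes each parser signals failure with an `IllegalArgumentException` (which `NumberFormatException` extends in Kotlin).

```kotlin
// Hypothetical helper, not part of kotlinx-datetime: tries each parser in order
// and returns the first successful result.
fun <T> parseWithFallbacks(input: String, parsers: List<(String) -> T>): T {
    require(parsers.isNotEmpty()) { "At least one parser is required" }
    var lastError: IllegalArgumentException? = null
    for (parser in parsers) {
        try {
            return parser(input)
        } catch (e: IllegalArgumentException) {
            lastError = e // remember the failure and try the next format
        }
    }
    // Every parser rejected the input; report the last failure as the cause.
    throw IllegalArgumentException("No parser accepted '$input'", lastError)
}
```

With a shape like this, supporting one more format would amount to appending another lambda to the list.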


/*
* The algorithm for parsing time and zone offset was adapted from
* https://github.com/square/moshi/blob/aea17e09bc6a3f9015d3de0e951923f1033d299e/adapters/src/main/java/com/squareup/moshi/adapters/Iso8601Utils.java
Author

Please advise on the proper wording of the attribution that should be added.

*/
private fun parseIsoString(isoString: String): Instant {
try {
val dateTimeSplit = isoString.split('T', ignoreCase = true)
if (dateTimeSplit.size != 2) {
throw DateTimeFormatException("ISO 8601 datetime must contain exactly one (T|t) delimiter.")
}
val localDate = LocalDate.parse(dateTimeSplit[0])

// Iso8601Utils.parse
val timePart = dateTimeSplit[1]
var offset = 0
val hour = parseInt(timePart, offset, offset + 2).also { offset += 2 }
if (checkOffset(timePart, offset, ':')) {
offset += 1
}
val minutes = parseInt(timePart, offset, offset + 2).also { offset += 2 }
if (checkOffset(timePart, offset, ':')) {
offset += 1
}

var seconds = 0
var nanosecond = 0
// seconds and fraction can be optional
if (timePart.length > offset) {
val c = timePart[offset]
if (c != 'Z' && c != 'z' && c != '+' && c != '-') {
seconds = parseInt(timePart, offset, offset + 2).also { offset += 2 }
if (seconds > 59 && seconds < 63) { // https://github.com/Kotlin/kotlinx-datetime/issues/5
Author

Is #5 applicable here, or should this be converted to an idiomatic range check?

Collaborator

It is applicable. That said, this line is strange in any case: the existing parser for LocalDateTime doesn't recognize leap seconds, so we have a disconnect between

        Instant.parse("2020-06-14T01:01:61Z")

successfully parsing and

        LocalDateTime.parse("2020-06-14T01:01:61")

failing.

Also, note that Java Time's parser throws on seconds outside of [0; 60). Why should we accept leap seconds, and why only in Instant and not in LocalDateTime?
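
For comparison, a strict check matching the `[0; 60)` behaviour described above could look like the sketch below. `parseSecondOfMinute` is a hypothetical helper, not part of this PR or of kotlinx-datetime; it rejects values such as 60 or 61 instead of clamping them to 59.

```kotlin
// Hypothetical helper: parses a two-digit second-of-minute field and rejects
// anything outside 0..59, so "60" and "61" fail instead of being clamped.
fun parseSecondOfMinute(field: String): Int {
    require(field.length == 2 && field.all { it in '0'..'9' }) {
        "Expected two digits for seconds, got '$field'"
    }
    val seconds = field.toInt()
    require(seconds in 0..59) { "Seconds out of range: $seconds" }
    return seconds
}
```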

seconds = 59 // truncate up to 3 leap seconds
}
if (checkOffset(timePart, offset, '.')) {
offset += 1
val endOffset =
indexOfNonDigit(timePart, offset + 1) // assume at least one digit
val parseEndOffset =
min(endOffset, offset + 9) // parse up to 9 digits
val fraction = parseInt(timePart, offset, parseEndOffset)
Collaborator

This code handles fractions that have more than the nine digits representable as nanoseconds by truncating the extra digits toward zero, without even rounding them. So,

Instant.parse("2020-06-14T01:01:59.1234567899Z")
// results in 2020-06-14T01:01:59.123456789Z

This is highly questionable because, again, parsing LocalDateTime throws in such cases, as does parsing Instant in Java. Throwing makes sense, given that silently losing user input is highly undesirable. An argument could perhaps be made that rounding is a sensible solution here, but even then the reasoning would have to be applied consistently across all the parsers we have.
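
A strict alternative to truncation might look like the following sketch. `fractionToNanos` is a hypothetical helper, not code from this PR; it assumes the caller passes only the digits after the decimal point and it throws when there are more than nine of them.

```kotlin
// Hypothetical helper: converts the digits after the decimal point into
// nanoseconds, rejecting input that nanosecond precision cannot represent.
fun fractionToNanos(digits: String): Int {
    require(digits.isNotEmpty() && digits.all { it in '0'..'9' }) {
        "Fraction must consist of decimal digits, got '$digits'"
    }
    require(digits.length <= 9) {
        "Fraction '$digits' has more than nine digits"
    }
    // Right-pad to nine digits so that "5" becomes "500000000".
    return digits.padEnd(9, '0').toInt()
}
```

Padding the digit string also avoids the floating-point `pow` call used for scaling in the diff.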

nanosecond = (10.0.pow(9 - (parseEndOffset - offset)) * fraction).toInt()
offset = endOffset
}
}
}

// extract timezone
if (timePart.length <= offset) {
throw IllegalArgumentException("No time zone indicator in '$timePart'")
}
val timezone: TimeZone
val timezoneIndicator = timePart[offset]
if (timezoneIndicator == 'Z' || timezoneIndicator == 'z') {
timezone = TimeZone.UTC
} else if (timezoneIndicator == '+' || timezoneIndicator == '-') {
val timezoneOffset = timePart.substring(offset)
// 18-Jun-2015, tatu: Minor simplification, skip offset of "+0000"/"+00:00"
Collaborator

These comments have little meaning without the commit history, so they probably shouldn't be included here.

if ("+0000" == timezoneOffset || "+00:00" == timezoneOffset) {
timezone = TimeZone.UTC
} else {
val timezoneId = "UTC$timezoneOffset"
timezone = TimeZone.of(timezoneId)
val act = timezone.id
if (act != timezoneId) {
/* 22-Jan-2015, tatu: Looks like canonical version has colons,
* but we may be given one without. If so, don't sweat.
* Yes, very inefficient. Hopefully not hit often.
* If it becomes a perf problem, add 'loose' comparison instead.
*/
val cleaned = act.replace(":", "")
if (cleaned != timezoneId) {
throw IllegalTimeZoneException(
"Mismatching time zone indicator: "
+ timezoneId
+ " given, resolves to "
+ timezone.id
)
}
}
}
} else {
throw DateTimeFormatException("Invalid time zone indicator '$timezoneIndicator'")
}
return localDate.atTime(hour, minutes, seconds, nanosecond).toInstant(timezone)
} catch (e: NumberFormatException) {
throw DateTimeFormatException(e)
}
}

/**
* Check if the expected character exist at the given offset in the value.
*
* @param value the string to check at the specified offset
* @param offset the offset to look for the expected character
* @param expected the expected character
* @return true if the expected character exist at the given offset
*/
private fun checkOffset(value: String, offset: Int, expected: Char): Boolean {
return (offset < value.length) && (value[offset] == expected)
}

/**
* Parse an integer located between 2 given offsets in a string
*
* @param value the string to parse
* @param beginIndex the start index for the integer in the string
* @param endIndex the end index for the integer in the string
* @return the int
* @throws NumberFormatException if the value is not a number
*/
@OptIn(ExperimentalStdlibApi::class)
private fun parseInt(value: String, beginIndex: Int, endIndex: Int): Int {
if ((beginIndex < 0) || (endIndex > value.length) || (beginIndex > endIndex)) {
throw NumberFormatException(value)
}
return value.substring(beginIndex, endIndex).toInt()
}

/**
* Returns the index of the first character in the string that is not a digit, starting at offset.
*/
private fun indexOfNonDigit(string: String, offset: Int): Int {
for (i in offset until string.length) {
val c = string[i]
if (c < '0' || c > '9') return i
}
return string.length
}
54 changes: 54 additions & 0 deletions core/common/test/InstantTest.kt
@@ -59,9 +59,13 @@ class InstantTest {
@Test
fun parseIsoString() {
val instants = arrayOf(
Triple("1970-01-01T0000Z", 0, 0),
Triple("1970-01-01T00:00Z", 0, 0),
Triple("1970-01-01T000000Z", 0, 0),
Triple("1970-01-01T00:00:00Z", 0, 0),
Triple("1970-01-01t00:00:00Z", 0, 0),
Triple("1970-01-01T00:00:00z", 0, 0),
Triple("1970-01-01t00:00:00z", 0, 0),
Triple("1970-01-01T00:00:00.0Z", 0, 0),
Triple("1970-01-01T00:00:00.000000000Z", 0, 0),
Triple("1970-01-01T00:00:00.000000001Z", 0, 1),
@@ -80,11 +84,61 @@ class InstantTest {
}

assertInvalidFormat { Instant.parse("x") }
assertInvalidFormat { Instant.parse("1970-01-01T00:00.1Z") }
assertInvalidFormat { Instant.parse("12020-12-31T23:59:59.000000000Z") }
// this string represents an Instant that is currently larger than Instant.MAX any of the implementations:
assertInvalidFormat { Instant.parse("+1000000001-12-31T23:59:59.000000000Z") }
}

@Test
fun isoTimezoneOffsets() {
val validOffsets = arrayOf(
"1970-01-01T00:00:00Z",
"1970-01-01T00:00:00z",

"1970-01-01T00:00:00+00:00",
"1970-01-01T00:00:00+0000",

"1970-01-01T01:00:00+01:00",
"1970-01-01T01:00:00+0100",

"1970-01-01T18:00:00+18:00",
"1970-01-01T18:00:00+1800",

"1970-01-01T00:01:00+00:01",
"1970-01-01T00:01:00+0001",

"1969-12-31T23:00:00-01:00",
"1969-12-31T23:00:00-0100",

"1969-12-31T06:00:00-18:00",
"1969-12-31T06:00:00-1800",

"1969-12-31T23:59:00-00:01",
"1969-12-31T23:59:00-0001",
)
validOffsets.forEach {
assertEquals(0, Instant.parse(it).toEpochMilliseconds())
}

val invalidOffsets = arrayOf(
"1970-01-01T18:01:00+18:01",
"1970-01-01T18:01:00+1801",

"1969-12-31T05:59:00-18:01",
"1969-12-31T05:59:00-1801",

"1970-01-01T01:00:00+01",
"1970-01-01T01:00:00+01",

"1970-01-01T01:00:00+1:00",
"1970-01-01T01:00:00+100",
)
invalidOffsets.forEach {
assertFailsWith<IllegalArgumentException> { Instant.parse(it) }
}
}

@OptIn(ExperimentalTime::class)
@Test
fun instantCalendarArithmetic() {
7 changes: 1 addition & 6 deletions core/js/src/Instant.kt
@@ -75,12 +75,7 @@ public actual class Instant internal constructor(internal val value: jtInstant)
if (epochMilliseconds > 0) MAX else MIN
}

- actual fun parse(isoString: String): Instant = try {
- Instant(jtInstant.parse(isoString))
- } catch (e: Throwable) {
- if (e.isJodaDateTimeParseException()) throw DateTimeFormatException(e)
- throw e
- }
+ actual fun parse(isoString: String): Instant = parseInstantCommon(isoString)
Collaborator (@dkhalanskyjb, Apr 5, 2021)

This could be implemented easily, and consistently with the other parsers, by parsing an OffsetDateTime and then converting that to Instant.

The same for the Java implementation.
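
On the JVM, this suggestion could look roughly like the sketch below, which relies on `java.time.OffsetDateTime.parse` and `toInstant()`. The function name `parseInstantViaOffsetDateTime` is hypothetical, `Instant` here is `java.time.Instant` rather than the library's wrapper, and the conversion of parse errors into `DateTimeFormatException` is left out. Whether every offset spelling in this PR's tests (for example the colon-less `+0100`) is accepted is not verified here.

```kotlin
import java.time.Instant
import java.time.OffsetDateTime

// Hypothetical sketch: parse date, time and UTC offset together, then drop the
// offset by converting the result to a point on the UTC timeline.
fun parseInstantViaOffsetDateTime(isoString: String): Instant =
    OffsetDateTime.parse(isoString).toInstant()
```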


actual fun fromEpochSeconds(epochSeconds: Long, nanosecondAdjustment: Long): Instant = try {
/* Performing normalization here because otherwise this fails:
7 changes: 1 addition & 6 deletions core/jvm/src/Instant.kt
@@ -9,7 +9,6 @@ package kotlinx.datetime
import kotlinx.datetime.serializers.InstantIso8601Serializer
import kotlinx.serialization.Serializable
import java.time.DateTimeException
- import java.time.format.DateTimeParseException
import java.time.temporal.ChronoUnit
import kotlin.time.*
import java.time.Instant as jtInstant
@@ -62,11 +61,7 @@ public actual class Instant internal constructor(internal val value: jtInstant)
actual fun fromEpochMilliseconds(epochMilliseconds: Long): Instant =
Instant(jtInstant.ofEpochMilli(epochMilliseconds))

- actual fun parse(isoString: String): Instant = try {
- Instant(jtInstant.parse(isoString))
- } catch (e: DateTimeParseException) {
- throw DateTimeFormatException(e)
- }
+ actual fun parse(isoString: String): Instant = parseInstantCommon(isoString)

actual fun fromEpochSeconds(epochSeconds: Long, nanosecondAdjustment: Long): Instant = try {
Instant(jtInstant.ofEpochSecond(epochSeconds, nanosecondAdjustment))
51 changes: 1 addition & 50 deletions core/native/src/Instant.kt
@@ -23,54 +23,6 @@ public actual enum class DayOfWeek {
SUNDAY;
}

// This is a function and not a value due to https://github.com/Kotlin/kotlinx-datetime/issues/5
// org.threeten.bp.format.DateTimeFormatterBuilder.InstantPrinterParser#parse
private val instantParser: Parser<Instant>
get() = localDateParser
.chainIgnoring(concreteCharParser('T').or(concreteCharParser('t')))
.chain(intParser(2, 2)) // hour
.chainIgnoring(concreteCharParser(':'))
.chain(intParser(2, 2)) // minute
.chainIgnoring(concreteCharParser(':'))
.chain(intParser(2, 2)) // second
.chain(optional(
concreteCharParser('.')
.chainSkipping(fractionParser(0, 9, 9)) // nanos
))
.chainIgnoring(concreteCharParser('Z').or(concreteCharParser('z')))
Collaborator (@dkhalanskyjb, Apr 5, 2021)

If the other parsers are implemented via OffsetDateTime, there is no need to have the parser for Instant in the common code, and the only place that requires an explicit implementation is the native part. In that case, it's probably easier to modify this parser at this line so that it accepts any ZoneOffset and not just Z (and to propagate this change correspondingly).
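
Accepting any offset rather than just `Z` mostly comes down to parsing the offset suffix. The sketch below shows one possible shape and is not the library's parser: `parseUtcOffsetSeconds` and `offsetPattern` are hypothetical names, and the accepted forms (`Z`, `±HH:MM`, `±HHMM`, at most 18 hours) are assumptions taken from the `isoTimezoneOffsets` test in this PR.

```kotlin
// Hypothetical pattern: sign, two hour digits, optional colon, two minute digits.
val offsetPattern = Regex("""([+-])(\d{2}):?(\d{2})""")

// Hypothetical helper: returns the UTC offset in seconds for "Z", "±HH:MM" or "±HHMM".
fun parseUtcOffsetSeconds(text: String): Int {
    if (text == "Z" || text == "z") return 0
    val match = offsetPattern.matchEntire(text)
        ?: throw IllegalArgumentException("Invalid UTC offset '$text'")
    val (sign, hourDigits, minuteDigits) = match.destructured
    val hours = hourDigits.toInt()
    val minutes = minuteDigits.toInt()
    // Assumed limits: minutes below 60 and a total offset of at most 18:00.
    require(minutes <= 59 && (hours < 18 || (hours == 18 && minutes == 0))) {
        "UTC offset out of range: '$text'"
    }
    val totalSeconds = hours * 3600 + minutes * 60
    return if (sign == "-") -totalSeconds else totalSeconds
}
```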

.map {
val (dateHourMinuteSecond, nanosVal) = it
val (dateHourMinute, secondsVal) = dateHourMinuteSecond
val (dateHour, minutesVal) = dateHourMinute
val (dateVal, hoursVal) = dateHour

val nano = nanosVal ?: 0
val (days, hours, min, seconds) = if (hoursVal == 24 && minutesVal == 0 && secondsVal == 0 && nano == 0) {
listOf(1, 0, 0, 0)
} else if (hoursVal == 23 && minutesVal == 59 && secondsVal == 60) {
// parsed a leap second, but it seems it isn't used
listOf(0, 23, 59, 59)
} else {
listOf(0, hoursVal, minutesVal, secondsVal)
}

// never fails: 9_999 years are always supported
val localDate = dateVal.withYear(dateVal.year % 10000).plus(days, DateTimeUnit.DAY)
val localTime = LocalTime.of(hours, min, seconds, 0)
val secDelta: Long = try {
safeMultiply((dateVal.year / 10000).toLong(), SECONDS_PER_10000_YEARS)
} catch (e: ArithmeticException) {
throw DateTimeFormatException(e)
}
val epochDay = localDate.toEpochDay().toLong()
val instantSecs = epochDay * 86400 + localTime.toSecondOfDay() + secDelta
try {
Instant(instantSecs, nano)
} catch (e: IllegalArgumentException) {
throw DateTimeFormatException(e)
}
}

/**
* The minimum supported epoch second.
*/
@@ -243,8 +195,7 @@ public actual class Instant internal constructor(actual val epochSeconds: Long,
actual fun fromEpochSeconds(epochSeconds: Long, nanosecondAdjustment: Int): Instant =
fromEpochSeconds(epochSeconds, nanosecondAdjustment.toLong())

- actual fun parse(isoString: String): Instant =
- instantParser.parse(isoString)
+ actual fun parse(isoString: String): Instant = parseInstantCommon(isoString)

actual val DISTANT_PAST: Instant = fromEpochSeconds(DISTANT_PAST_SECONDS, 999_999_999)
