diff --git a/engine/core/html/parser/BackgroundHTMLParser.cpp b/engine/core/html/parser/BackgroundHTMLParser.cpp
index 67213072a41fafb886b616e95a2e87fccd0e8ea7..4807d1b153ce9c61c801c40e2e5277f10342691b 100644
--- a/engine/core/html/parser/BackgroundHTMLParser.cpp
+++ b/engine/core/html/parser/BackgroundHTMLParser.cpp
@@ -113,22 +113,11 @@ bool BackgroundHTMLParser::updateTokenizerState(const CompactHTMLToken& token)
 {
     if (token.type() == HTMLToken::StartTag) {
         const String& tagName = token.data();
-        // FIXME: This is just a copy of Tokenizer::updateStateFor which uses threadSafeMatches.
-        if (threadSafeMatch(tagName, HTMLNames::scriptTag))
-            m_tokenizer->setState(HTMLTokenizer::ScriptDataState);
-        else if (threadSafeMatch(tagName, HTMLNames::styleTag))
+        if (threadSafeMatch(tagName, HTMLNames::scriptTag) || threadSafeMatch(tagName, HTMLNames::styleTag))
             m_tokenizer->setState(HTMLTokenizer::RAWTEXTState);
     }
 
-    if (token.type() == HTMLToken::EndTag) {
-        const String& tagName = token.data();
-        if (threadSafeMatch(tagName, HTMLNames::scriptTag)) {
-            m_tokenizer->setState(HTMLTokenizer::DataState);
-            return false;
-        }
-    }
-
-    return true;
+    return token.type() != HTMLToken::EndTag || !threadSafeMatch(token.data(), HTMLNames::scriptTag);
 }
 
 void BackgroundHTMLParser::pumpTokenizer()
diff --git a/engine/core/html/parser/HTMLTokenizer.cpp b/engine/core/html/parser/HTMLTokenizer.cpp
index 8fbf9ac43ec401411e423ab7b51c519e621f93cf..7e765b277ea71084fe31a34ad8f7308930a44c34 100644
--- a/engine/core/html/parser/HTMLTokenizer.cpp
+++ b/engine/core/html/parser/HTMLTokenizer.cpp
@@ -86,10 +86,6 @@ static inline bool isEndTagBufferingState(HTMLTokenizer::State state)
     switch (state) {
     case HTMLTokenizer::RAWTEXTEndTagOpenState:
     case HTMLTokenizer::RAWTEXTEndTagNameState:
-    case HTMLTokenizer::ScriptDataEndTagOpenState:
-    case HTMLTokenizer::ScriptDataEndTagNameState:
-    case HTMLTokenizer::ScriptDataEscapedEndTagOpenState:
-    case HTMLTokenizer::ScriptDataEscapedEndTagNameState:
         return true;
     default:
         return false;
@@ -231,18 +227,6 @@ bool HTMLTokenizer::nextToken(SegmentedString& source, HTMLToken& token)
     }
     END_STATE()
 
-    HTML_BEGIN_STATE(ScriptDataState) {
-        if (cc == '<')
-            HTML_ADVANCE_TO(ScriptDataLessThanSignState);
-        else if (cc == kEndOfFileMarker)
-            return emitEndOfFile(source);
-        else {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataState);
-        }
-    }
-    END_STATE()
-
     HTML_BEGIN_STATE(TagOpenState) {
         if (cc == '!')
             HTML_ADVANCE_TO(MarkupDeclarationOpenState);
@@ -377,325 +361,6 @@ bool HTMLTokenizer::nextToken(SegmentedString& source, HTMLToken& token)
     }
     END_STATE()
 
-    HTML_BEGIN_STATE(ScriptDataLessThanSignState) {
-        if (cc == '/') {
-            m_temporaryBuffer.clear();
-            ASSERT(m_bufferedEndTagName.isEmpty());
-            HTML_ADVANCE_TO(ScriptDataEndTagOpenState);
-        } else if (cc == '!') {
-            bufferCharacter('<');
-            bufferCharacter('!');
-            HTML_ADVANCE_TO(ScriptDataEscapeStartState);
-        } else {
-            bufferCharacter('<');
-            HTML_RECONSUME_IN(ScriptDataState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEndTagOpenState) {
-        if (isASCIIUpper(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(toLowerCase(cc)));
-            HTML_ADVANCE_TO(ScriptDataEndTagNameState);
-        } else if (isASCIILower(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(cc));
-            HTML_ADVANCE_TO(ScriptDataEndTagNameState);
-        } else {
-            bufferCharacter('<');
-            bufferCharacter('/');
-            HTML_RECONSUME_IN(ScriptDataState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEndTagNameState) {
-        if (isASCIIUpper(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(toLowerCase(cc)));
-            HTML_ADVANCE_TO(ScriptDataEndTagNameState);
-        } else if (isASCIILower(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(cc));
-            HTML_ADVANCE_TO(ScriptDataEndTagNameState);
-        } else {
-            if (isTokenizerWhitespace(cc)) {
-                if (isAppropriateEndTag()) {
-                    m_temporaryBuffer.append(static_cast<LChar>(cc));
-                    FLUSH_AND_ADVANCE_TO(BeforeAttributeNameState);
-                }
-            } else if (cc == '/') {
-                if (isAppropriateEndTag()) {
-                    m_temporaryBuffer.append(static_cast<LChar>(cc));
-                    FLUSH_AND_ADVANCE_TO(SelfClosingStartTagState);
-                }
-            } else if (cc == '>') {
-                if (isAppropriateEndTag()) {
-                    m_temporaryBuffer.append(static_cast<LChar>(cc));
-                    return flushEmitAndResumeIn(source, HTMLTokenizer::DataState);
-                }
-            }
-            bufferCharacter('<');
-            bufferCharacter('/');
-            m_token->appendToCharacter(m_temporaryBuffer);
-            m_bufferedEndTagName.clear();
-            m_temporaryBuffer.clear();
-            HTML_RECONSUME_IN(ScriptDataState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapeStartState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapeStartDashState);
-        } else
-            HTML_RECONSUME_IN(ScriptDataState);
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapeStartDashState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapedDashDashState);
-        } else
-            HTML_RECONSUME_IN(ScriptDataState);
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapedState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapedDashState);
-        } else if (cc == '<')
-            HTML_ADVANCE_TO(ScriptDataEscapedLessThanSignState);
-        else if (cc == kEndOfFileMarker) {
-            parseError();
-            HTML_RECONSUME_IN(DataState);
-        } else {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapedDashState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapedDashDashState);
-        } else if (cc == '<')
-            HTML_ADVANCE_TO(ScriptDataEscapedLessThanSignState);
-        else if (cc == kEndOfFileMarker) {
-            parseError();
-            HTML_RECONSUME_IN(DataState);
-        } else {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapedDashDashState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapedDashDashState);
-        } else if (cc == '<')
-            HTML_ADVANCE_TO(ScriptDataEscapedLessThanSignState);
-        else if (cc == '>') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataState);
-        } else if (cc == kEndOfFileMarker) {
-            parseError();
-            HTML_RECONSUME_IN(DataState);
-        } else {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapedLessThanSignState) {
-        if (cc == '/') {
-            m_temporaryBuffer.clear();
-            ASSERT(m_bufferedEndTagName.isEmpty());
-            HTML_ADVANCE_TO(ScriptDataEscapedEndTagOpenState);
-        } else if (isASCIIUpper(cc)) {
-            bufferCharacter('<');
-            bufferCharacter(cc);
-            m_temporaryBuffer.clear();
-            m_temporaryBuffer.append(toLowerCase(cc));
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapeStartState);
-        } else if (isASCIILower(cc)) {
-            bufferCharacter('<');
-            bufferCharacter(cc);
-            m_temporaryBuffer.clear();
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapeStartState);
-        } else {
-            bufferCharacter('<');
-            HTML_RECONSUME_IN(ScriptDataEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapedEndTagOpenState) {
-        if (isASCIIUpper(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(toLowerCase(cc)));
-            HTML_ADVANCE_TO(ScriptDataEscapedEndTagNameState);
-        } else if (isASCIILower(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(cc));
-            HTML_ADVANCE_TO(ScriptDataEscapedEndTagNameState);
-        } else {
-            bufferCharacter('<');
-            bufferCharacter('/');
-            HTML_RECONSUME_IN(ScriptDataEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataEscapedEndTagNameState) {
-        if (isASCIIUpper(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(toLowerCase(cc)));
-            HTML_ADVANCE_TO(ScriptDataEscapedEndTagNameState);
-        } else if (isASCIILower(cc)) {
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            addToPossibleEndTag(static_cast<LChar>(cc));
-            HTML_ADVANCE_TO(ScriptDataEscapedEndTagNameState);
-        } else {
-            if (isTokenizerWhitespace(cc)) {
-                if (isAppropriateEndTag()) {
-                    m_temporaryBuffer.append(static_cast<LChar>(cc));
-                    FLUSH_AND_ADVANCE_TO(BeforeAttributeNameState);
-                }
-            } else if (cc == '/') {
-                if (isAppropriateEndTag()) {
-                    m_temporaryBuffer.append(static_cast<LChar>(cc));
-                    FLUSH_AND_ADVANCE_TO(SelfClosingStartTagState);
-                }
-            } else if (cc == '>') {
-                if (isAppropriateEndTag()) {
-                    m_temporaryBuffer.append(static_cast<LChar>(cc));
-                    return flushEmitAndResumeIn(source, HTMLTokenizer::DataState);
-                }
-            }
-            bufferCharacter('<');
-            bufferCharacter('/');
-            m_token->appendToCharacter(m_temporaryBuffer);
-            m_bufferedEndTagName.clear();
-            m_temporaryBuffer.clear();
-            HTML_RECONSUME_IN(ScriptDataEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataDoubleEscapeStartState) {
-        if (isTokenizerWhitespace(cc) || cc == '/' || cc == '>') {
-            bufferCharacter(cc);
-            if (temporaryBufferIs(HTMLNames::scriptTag.localName()))
-                HTML_ADVANCE_TO(ScriptDataDoubleEscapedState);
-            else
-                HTML_ADVANCE_TO(ScriptDataEscapedState);
-        } else if (isASCIIUpper(cc)) {
-            bufferCharacter(cc);
-            m_temporaryBuffer.append(toLowerCase(cc));
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapeStartState);
-        } else if (isASCIILower(cc)) {
-            bufferCharacter(cc);
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapeStartState);
-        } else
-            HTML_RECONSUME_IN(ScriptDataEscapedState);
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataDoubleEscapedState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedDashState);
-        } else if (cc == '<') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedLessThanSignState);
-        } else if (cc == kEndOfFileMarker) {
-            parseError();
-            HTML_RECONSUME_IN(DataState);
-        } else {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataDoubleEscapedDashState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedDashDashState);
-        } else if (cc == '<') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedLessThanSignState);
-        } else if (cc == kEndOfFileMarker) {
-            parseError();
-            HTML_RECONSUME_IN(DataState);
-        } else {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataDoubleEscapedDashDashState) {
-        if (cc == '-') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedDashDashState);
-        } else if (cc == '<') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedLessThanSignState);
-        } else if (cc == '>') {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataState);
-        } else if (cc == kEndOfFileMarker) {
-            parseError();
-            HTML_RECONSUME_IN(DataState);
-        } else {
-            bufferCharacter(cc);
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapedState);
-        }
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataDoubleEscapedLessThanSignState) {
-        if (cc == '/') {
-            bufferCharacter(cc);
-            m_temporaryBuffer.clear();
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapeEndState);
-        } else
-            HTML_RECONSUME_IN(ScriptDataDoubleEscapedState);
-    }
-    END_STATE()
-
-    HTML_BEGIN_STATE(ScriptDataDoubleEscapeEndState) {
-        if (isTokenizerWhitespace(cc) || cc == '/' || cc == '>') {
-            bufferCharacter(cc);
-            if (temporaryBufferIs(HTMLNames::scriptTag.localName()))
-                HTML_ADVANCE_TO(ScriptDataEscapedState);
-            else
-                HTML_ADVANCE_TO(ScriptDataDoubleEscapedState);
-        } else if (isASCIIUpper(cc)) {
-            bufferCharacter(cc);
-            m_temporaryBuffer.append(toLowerCase(cc));
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapeEndState);
-        } else if (isASCIILower(cc)) {
-            bufferCharacter(cc);
-            m_temporaryBuffer.append(static_cast<LChar>(cc));
-            HTML_ADVANCE_TO(ScriptDataDoubleEscapeEndState);
-        } else
-            HTML_RECONSUME_IN(ScriptDataDoubleEscapedState);
-    }
-    END_STATE()
-
     HTML_BEGIN_STATE(BeforeAttributeNameState) {
         if (isTokenizerWhitespace(cc))
             HTML_ADVANCE_TO(BeforeAttributeNameState);
diff --git a/engine/core/html/parser/HTMLTokenizer.h b/engine/core/html/parser/HTMLTokenizer.h
index 3e31f9675d3d690655f37f542c22bc6c6820550e..302b6482e1c2b58500599708fd350a23f3fb2b14 100644
--- a/engine/core/html/parser/HTMLTokenizer.h
+++ b/engine/core/html/parser/HTMLTokenizer.h
@@ -46,30 +46,12 @@ public:
         DataState,
         CharacterReferenceInDataState,
         RAWTEXTState,
-        ScriptDataState,
         TagOpenState,
         EndTagOpenState,
         TagNameState,
         RAWTEXTLessThanSignState,
         RAWTEXTEndTagOpenState,
         RAWTEXTEndTagNameState,
-        ScriptDataLessThanSignState,
-        ScriptDataEndTagOpenState,
-        ScriptDataEndTagNameState,
-        ScriptDataEscapeStartState,
-        ScriptDataEscapeStartDashState,
-        ScriptDataEscapedState,
-        ScriptDataEscapedDashState,
-        ScriptDataEscapedDashDashState,
-        ScriptDataEscapedLessThanSignState,
-        ScriptDataEscapedEndTagOpenState,
-        ScriptDataEscapedEndTagNameState,
-        ScriptDataDoubleEscapeStartState,
-        ScriptDataDoubleEscapedState,
-        ScriptDataDoubleEscapedDashState,
-        ScriptDataDoubleEscapedDashDashState,
-        ScriptDataDoubleEscapedLessThanSignState,
-        ScriptDataDoubleEscapeEndState,
         BeforeAttributeNameState,
         AttributeNameState,
         AfterAttributeNameState,