diff --git a/04.internals.md b/04.internals.md index 2ea6a31eb..313039886 100644 --- a/04.internals.md +++ b/04.internals.md @@ -212,7 +212,7 @@ ECMA component of the engine is responsible for the following notions: * Data representation * Runtime representation -* Garbage collection (gc) +* Garbage collection (GC) ## Data representation @@ -225,8 +225,8 @@ The major structure for data representation is `ECMA_value`. The lower two bits ![ECMA value representation]({{ site.baseurl }}/img/ecma_value.png){: class="thumbnail center-block img-responsive" } -In case of number, string and object the value contains an encoded pointer. -Simple value is a pre-defined constant which can be: +In case of a number, string or object the value contains an encoded pointer, and +a simple value is a pre-defined constant which can be: * undefined * null @@ -234,13 +234,13 @@ Simple value is a pre-defined constant which can be: * false * empty (uninitialized value) -For other value types the higher bits of `ECMA_value` structure contains compressed pointer to the real value. - ### Compressed pointers Compressed pointers were introduced to save heap space. ![Compressed Pointer]({{ site.baseurl }}/img/ecma_compressed.png){: class="thumbnail center-block img-responsive" } +These pointers are 8-byte aligned, 16-bit pointers that can address 512 KB of memory (2^16 values of 8 bytes each), which is also the maximum size of the JerryScript heap. + ECMA data elements are allocated in pools (pools are allocated on heap) Chunk size of the pool is 8 bytes (reduces fragmentation). @@ -257,90 +257,54 @@ Several references to single allocated number are not supported. Each reference ### String +Strings in JerryScript are not just character sequences, but can hold numbers and so-called magic ids too. For common character sequences there is a table in read-only memory that contains magic id and character sequence pairs.
If a string is already in this table, its magic id is stored, not the character sequence itself. Using numbers speeds up property access. These techniques save memory. + ### Object / Lexical environment -Object and lexical environment structures, 8 bytes each, have common (GC) header: - * Stack refs counter - * Next object/lexical environment in list of objects/lexical environments +An object can be a conventional data object or a lexical environment object. Unlike other data types, objects can have references (called properties) to other data types. Because of circular references, reference counting is not always enough to determine dead objects. Hence, a chain list is formed from all existing objects, which can be used to find unreferenced objects during garbage collection. The `gc-next` pointer of each object points to the next allocated object in the chain list. + +Lexical environments ([link](http://www.ecma-international.org/ecma-262/5.1/#sec-10.2)) are implemented as objects in JerryScript, since lexical environments contain key-value pairs (called bindings) like objects. This simplifies the implementation and reduces code size. + +![Object/Lexicat environment structures]({{ site.baseurl }}/img/ecma_object.png){: class="thumbnail center-block img-responsive" } + +Objects are represented by the following structure: + + * Reference counter - number of hard (non-property) references + * Next object pointer for the garbage collector * GC's visited flag - * is_lexenv flag + * type (function object, lexical environment, etc.) -Remaining fields of these structures are different and are shown on the figure below. +### Properties of objects +![Object properties]({{ site.baseurl }}/img/ecma_object_property.png){: class="thumbnail center-block img-responsive" } -![Object/Lexicat environment structures]({{ site.baseurl }}/img/ecma_object.jpg){: class="thumbnail center-block img-responsive" } +Objects have a linked list that contains their properties.
This list actually contains property pairs in order to save memory, as described below: +A property is 7 bits long and its type field is 2 bits long, which adds up to 9 bits; this does not fit into 1 byte, so a single property would consume 2 bytes. Hence, placing two properties (14 bits) together with the 2-bit type field fits exactly into 2 bytes. -### Property of an object / description of a lexical environment variable - -While objects comprise of properties, lexical environments consist of variables. Both of these units are tied up into lists. Unit types could be different: - * named data (property or variable) - * named accessor (property) - * internal (implementation defined) - -All these units occupy 8 bytes and have common header: - * type - 2 bit - * next property/variable in the object/lexical environment (compressed pointer) - -The remaining parts are differnt: -![Object property/lexcial environment variable]({{ site.baseurl }}/img/ecma_object_property.jpg){: class="thumbnail center-block img-responsive" } +If the number of property pairs reaches a limit (currently defined as 16), the first element of the property pair list is a hashmap (called the property hashmap), which is used to find a property instead of a linear search. ### Collections -ECMA runtime utilizes collections for intermediate calculations. Collection consists of a header and a number of linked chunks, which hold collection values. +Collections are array-like data structures that are optimized to save memory. A collection is a linked list whose elements are not single values, but arrays that can each contain multiple elements.
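The "linked list of arrays" idea behind collections can be illustrated with a minimal sketch. This is not JerryScript's actual code: the names `chunk_t`, `collection_t`, `collection_append`, `collection_get`, `collection_free`, and the capacity `CHUNK_CAPACITY` are hypothetical, and plain `int` values and native pointers stand in for ECMA values and compressed pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical number of values stored per list node. */
#define CHUNK_CAPACITY 3

/* One list node: a small array of values plus a link to the next node. */
typedef struct chunk
{
  int values[CHUNK_CAPACITY];
  struct chunk *next;
} chunk_t;

/* Collection header: first and last chunk and the total element count. */
typedef struct
{
  chunk_t *first;
  chunk_t *last;
  size_t count;
} collection_t;

/* Append a value; a new chunk is allocated only when the last one is full.
 * Returns 1 on success and 0 if the allocation fails. */
int collection_append (collection_t *c, int value)
{
  size_t used = c->count % CHUNK_CAPACITY;

  if (c->last == NULL || used == 0)
  {
    chunk_t *chunk = calloc (1, sizeof (*chunk));
    if (chunk == NULL)
    {
      return 0;
    }
    if (c->last == NULL)
    {
      c->first = chunk;
    }
    else
    {
      c->last->next = chunk;
    }
    c->last = chunk;
  }

  c->last->values[used] = value;
  c->count++;
  return 1;
}

/* Read the element at index: skip whole chunks, then index into the array.
 * Returns 1 and stores the value in *out, or 0 if index is out of range. */
int collection_get (const collection_t *c, size_t index, int *out)
{
  if (index >= c->count)
  {
    return 0;
  }

  const chunk_t *chunk = c->first;
  for (size_t skip = index / CHUNK_CAPACITY; skip > 0; skip--)
  {
    chunk = chunk->next;
  }

  *out = chunk->values[index % CHUNK_CAPACITY];
  return 1;
}

/* Release every chunk and reset the header. */
void collection_free (collection_t *c)
{
  chunk_t *chunk = c->first;
  while (chunk != NULL)
  {
    chunk_t *next = chunk->next;
    free (chunk);
    chunk = next;
  }
  c->first = NULL;
  c->last = NULL;
  c->count = 0;
}
```

In this sketch the cost of one `next` link is shared by `CHUNK_CAPACITY` values instead of being paid per element, which is the memory saving the section refers to.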
-Header occupies 8 bytes and consists of: - * compressed pointer to the next chunk - * number of elements - * rest space, aligned down to byte, is for the first chunk of data in collection +### Internal properties -Chunk's layout is following: - * compressed pointer to the next chunk - * rest space, aligned down to byte, is for data stored in corresponding part of the collection +Internal properties are special properties that carry meta-information which cannot be accessed by JavaScript code, but is important for the engine itself. Some examples of internal properties are listed below: -### Internal properties: - -* [[Class]] - class of the object (ECMA-defined) -* [[Prototipe]] - is stored in object description -* [[Extensible]] - is stored in object description -* [[CScope]] - lexical environment (function's variable space) -* [[ParametersMap]] - arguments object -0 code of the function -* [[Code]] - where to find bytecode of the function -* native code - where to find code of native unction -* native handle - some uintptr_t assosiated with the objec -* [[FormalParameters]] - collection of pointers to ecma_string_t (the list of formal parameters of the function) -* [[PrimitiveValue]] for String - for String object -* [[PrimitiveValue]] for Number - for Number object -* [[PrimitiveValue]] for Boolean - for Boolean object -* built-in related: - * built-in id - id of built-in object - * built-in routine id - id of built-in routine - * "non-instantiated" mask - what built-in properties where notinstantiated yet (lazy instantiation) - * extention object identifier +* [[Class]] - class (type) of the object (ECMA-defined) +* [[Code]] - points to the bytecode of the function +* native code - points to the code of a native function +* [[PrimitiveValue]] for Boolean - stores the boolean value of a Boolean object +* [[PrimitiveValue]] for Number - stores the numeric value of a Number object ### LCache -LCache is a cache for property variable search
requests. +LCache is a hashmap for finding a property specified by an object and a property name. The object-name-property layout of the LCache is repeated multiple times in each row, as shown in the figure below. ![LCache]({{ site.baseurl }}/img/ecma_lcache.png){: class="thumbnail center-block img-responsive"} -The entries of LCache has the following layout: - * object (pointer to object) - * property name (pointer to string) - * property (pointer to property) +When a property access occurs, a hash value is extracted from the requested property name, and then this hash is used to index the LCache. After that, the indexed row is searched for the specified object and property name. -The layout above presents multiple times in row. The rows of LCache is indexed by property name hash. When a property access occurs, all row's entries are searched by comparing object pointer and property name according entry's fields, full comparison is used for property name. - -If corresponding entry was found, its property pointer is returned (may be NULL - in case when there is no property with specified name in given object). -Otherwise, the property set of the considered object is iterated over and the corresponding record is registered in LCache (with property pointer if it was found or NULL otherwise). - -## Runtime - -ECMA-defined runtime operations are implemented mostly with routine having the following signature: - -`ecma_completion_value_t ecma_op_* ([ecma_value_t arguments])` -or -`ecma_property_t * ecma_op_[find/get]*_property (objs, name string, ...)` - -However, there could be some combinations. +It is important to note that if the specified property is not found in the LCache, this does not mean that it does not exist. In that case, it is searched for in the property list of the object, and if it is found there, the property is placed into the LCache.
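The lookup flow described for the LCache (hash the name, scan one row, fall back to the property list on a miss, then cache the result) can be sketched roughly as follows. This is an illustration under stated assumptions, not the engine's code: all type and function names, the row count `LCACHE_ROWS`, the row length `LCACHE_ROW_LENGTH`, the hash function, and the replacement policy are hypothetical, and native pointers with C strings stand in for compressed pointers and ECMA strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LCACHE_ROWS 8       /* hypothetical number of rows */
#define LCACHE_ROW_LENGTH 2 /* hypothetical entries per row */

/* Simplified property: a name, a value, and a link to the next property. */
typedef struct property
{
  const char *name;
  int value;
  struct property *next;
} property_t;

/* Simplified object: only the head of its property list. */
typedef struct
{
  property_t *properties;
} object_t;

/* One LCache entry: the object-name-property triple. */
typedef struct
{
  const object_t *object;
  const char *name;
  property_t *property;
} lcache_entry_t;

static lcache_entry_t lcache[LCACHE_ROWS][LCACHE_ROW_LENGTH];

/* Hash the property name to a row index (a simple multiplicative string
 * hash chosen only for illustration). */
static size_t lcache_row_index (const char *name)
{
  uint32_t hash = 5381u;
  for (; *name != '\0'; name++)
  {
    hash = hash * 33u + (unsigned char) *name;
  }
  return hash % LCACHE_ROWS;
}

/* Search only the row selected by the name hash. */
static property_t *lcache_lookup (const object_t *object, const char *name)
{
  const lcache_entry_t *row = lcache[lcache_row_index (name)];
  for (size_t i = 0; i < LCACHE_ROW_LENGTH; i++)
  {
    if (row[i].object == object && strcmp (row[i].name, name) == 0)
    {
      return row[i].property;
    }
  }
  return NULL;
}

/* Store an entry in the first free slot of its row; if the row is full,
 * overwrite slot 0 (a deliberately naive replacement policy). */
static void lcache_insert (const object_t *object, property_t *property)
{
  lcache_entry_t *row = lcache[lcache_row_index (property->name)];
  size_t slot = 0;
  for (size_t i = 0; i < LCACHE_ROW_LENGTH; i++)
  {
    if (row[i].object == NULL)
    {
      slot = i;
      break;
    }
  }
  row[slot].object = object;
  row[slot].name = property->name;
  row[slot].property = property;
}

/* Full lookup: try the LCache first; on a miss, walk the property list of
 * the object and cache the property if it exists. */
static property_t *find_property (const object_t *object, const char *name)
{
  property_t *property = lcache_lookup (object, name);
  if (property != NULL)
  {
    return property;
  }

  for (property = object->properties; property != NULL; property = property->next)
  {
    if (strcmp (property->name, name) == 0)
    {
      lcache_insert (object, property);
      return property;
    }
  }
  return NULL;
}
```

Note that a miss in `lcache_lookup` says nothing about whether the property exists; only the walk over the property list in `find_property` can answer that, which matches the caveat in the text.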
### Completion value diff --git a/css/img.css b/css/img.css index 14d88219a..7c50b6486 100644 --- a/css/img.css +++ b/css/img.css @@ -19,7 +19,7 @@ img[alt="byte-code layout"] { } img[alt="ECMA value representation"] { - max-width: 40%; + max-width: 50%; display: block; } diff --git a/img/ecma_object.jpg b/img/ecma_object.jpg deleted file mode 100644 index 77bcc50fd..000000000 Binary files a/img/ecma_object.jpg and /dev/null differ diff --git a/img/ecma_object.png b/img/ecma_object.png new file mode 100644 index 000000000..b0d45450d Binary files /dev/null and b/img/ecma_object.png differ diff --git a/img/ecma_object_property.jpg b/img/ecma_object_property.jpg deleted file mode 100644 index fa71da8da..000000000 Binary files a/img/ecma_object_property.jpg and /dev/null differ diff --git a/img/ecma_object_property.png b/img/ecma_object_property.png new file mode 100644 index 000000000..10fcdb182 Binary files /dev/null and b/img/ecma_object_property.png differ