- Properties allow a class, struct, or module to monitor state changes that occur when accessing fields. When a property is invoked it can set dirty flags, increment or decrement counters, allocate memory, fire events, and perform any number of other useful side effects. These side effects can be very important for making sure the internal representation of an object stays in sync with what the external API describes.
- Properties allow the use of calculated values in a seamless manner. It is possible with properties to create a member in a struct or class and have it act like a field, except that it requires no storage. More importantly, some values are derived purely from other values, and keeping the derived value in sync can be difficult if it is stored separately. Properties allow the derived value to be calculated lazily whenever it is needed by external code. One example is given below in "What happens without properties". Another is given here:
- Properties are mostly interoperable with fields. In most cases they can be converted to fields, and fields can be converted to properties, all without changing how they are used externally.
- This is important because it allows APIs to migrate between fields and properties depending on how much expressiveness is needed.
- There is one notable exception to the interoperability: addressing. Fields have addresses in memory, while properties may not have such a thing.
- This point is discussed in more detail here.
- API writers can use properties to enforce a nouns vs verbs distinction.
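The first two points in the list above (side effects on access and calculated values) can be sketched as follows. This is a minimal illustration rather than the page's original listing: `Rectangle`, `Document`, and their members are hypothetical names, and the accessors are written as plain member functions, relying on omissible parentheses.

```d
// Hypothetical types, used only for illustration.
struct Rectangle
{
    double width = 0;
    double height = 0;

    // Calculated value: reads like a field but needs no storage,
    // so it can never fall out of sync with width and height.
    double area() const { return width * height; }
}

struct Document
{
    private string _title;
    bool dirty; // set as a side effect of writing the title

    string title() const { return _title; }

    void title(string value)
    {
        _title = value;
        dirty = true; // the setter monitors the state change
    }
}

unittest
{
    auto r = Rectangle(3, 4);
    assert(r.area == 12);
    r.width = 5;
    assert(r.area == 20); // recalculated on demand

    Document d;
    assert(!d.dirty);
    d.title = "Draft"; // calls the one-argument overload
    assert(d.dirty && d.title == "Draft");
}
```

External code uses `r.area` and `d.title` exactly as it would use fields, which is what makes the later migration arguments possible.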
Suppose some D programmer has just written an API:
They then realize that their API is not as powerful as it could be, so they want to improve it to look something like this:
There are two possibilities for this API's migration:
- The API is broken as the (r,g,b) fields no longer exist.
- The API was written as a function-only API in the first place:
With properties, a third path for migration presents itself:
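The original listings for this migration are not preserved here, so the following is only a sketch of what it might look like. It assumes a colour type with `r`, `g`, and `b` members (as the text mentions) and invents a packed representation for the improved version; `ColorV1`, `ColorV2`, `packed`, and `brighten` are hypothetical names.

```d
// Before: a field-only API (hypothetical reconstruction).
struct ColorV1
{
    ubyte r, g, b;
}

// After: the storage changed, but r, g and b survive as properties.
struct ColorV2
{
    private uint packed; // assumed layout: 0x00RRGGBB

    ubyte r() const { return cast(ubyte)(packed >> 16); }
    ubyte g() const { return cast(ubyte)(packed >> 8); }
    ubyte b() const { return cast(ubyte) packed; }

    void r(ubyte v) { packed = (packed & 0x00FFFF) | (v << 16); }
    void g(ubyte v) { packed = (packed & 0xFF00FF) | (v << 8); }
    void b(ubyte v) { packed = (packed & 0xFFFF00) | v; }
}

// Client code written against the field version. It is a template
// only so that the same source can be compiled against both versions.
void brighten(C)(ref C c)
{
    c.r = 255;
    c.g = cast(ubyte)(c.g + 1);
}

unittest
{
    ColorV1 before;
    ColorV2 after;
    brighten(before);
    brighten(after);
    assert(before.r == 255 && after.r == 255);
    assert(before.g == 1 && after.g == 1);
    assert(before.b == 0 && after.b == 0);
}
```

The client function compiles unchanged against either version, which is the point of the third path: the representation moved, the usage did not.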
This distinction is useful in cases where a single English word has different meanings as a noun and as a verb. One example is the word "empty" which, in the context of a data structure like a range in D2, could either be a function that empties the range (verb) or a property that indicates whether the range is empty (noun). In this case, seeing array.empty() tells the reader that the array gets turned into a zero-element array, while array.empty looks like an expression that returns a boolean indicating whether the array is empty. In D2, the noun form is used for ranges. Another example likely to come up in programming is "transform", where object.transform() seems to transform the object while object.transform seems to be a child of object that describes how to do transformations.
D's current implementation of properties has had a number of complaints leveled against it:
- (1) Property modification by side effects is forbidden. "array.length++;" gives the error "Error: array.length is not an lvalue". This results from "array.length++;" being rewritten as "array.length()++;", an expression that should rightfully not compile.
- (2) The modification of property members leads to surprising results. If a is a property, then "a.b = 3;" will do absolutely nothing when executed.
- This results from "a.b = 3;" being rewritten as "a().b = 3;". Unfortunately, this expression does compile.
- There are cases where side effects on a member of a property are permissible, such as when b is a function: "b().c(5)". It might become more clear when written as something like "GetProcessOutput().Output(5);". (Suppose GetProcessOutput() returns a struct with references to other objects.)
- Another case where side effects on a member of a property are useful is in proxy structs:
- The presence of the above useful cases for rvalue mutation makes this problem more difficult to solve, since simply forbidding all rvalue mutation will harm D's expressiveness.
- There is a ticket in D's bug tracker for this.
- (3) D's properties do not have a distinctive syntax, so IDEs and 3rd party tools cannot identify properties. As an example, SharpDevelop is shown below analysing some C# code with properties. Notice how it identifies different kinds of class members with different icons in the frame on the right.
- (4) It is impossible for the property writer to specify how a property should be called (with, without (), or either) and have the compiler enforce the convention.
- (5) The delegate ambiguity. It is not possible to write properties that return delegates or function pointers:
- (6) It violates DRY in many cases, needing to duplicate the type and name (or some variant of the name) numerous times:
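Problems (2), (5), and (6) from the list above can be sketched as follows. The names `Point`, `Sprite`, and `answer` are hypothetical, and the silent-copy behaviour of (2) is spelled out with an explicit temporary, so the example does not depend on whether a given compiler version accepts the one-line form.

```d
struct Point { int x, y; }

struct Sprite
{
    // Problem (6): the type and the name "pos" each appear three times.
    private Point _pos;
    Point pos() { return _pos; }    // getter returns a copy
    void pos(Point p) { _pos = p; } // setter
}

// Problem (5): a zero-argument function that returns a delegate.
int delegate() answer()
{
    return delegate int() { return 42; };
}

unittest
{
    Sprite s;

    // Problem (2): "s.pos.x = 3;" is treated as "s.pos().x = 3;".
    // Spelled out, that is:
    auto tmp = s.pos(); // temporary copy of the struct
    tmp.x = 3;          // only the copy changes
    assert(s.pos.x == 0); // the sprite never saw the write

    // Problem (5): one pair of parentheses calls the function and
    // yields the delegate. A second pair is needed to invoke it.
    static assert(is(typeof(answer()) == int delegate()));
    assert(answer()() == 42);
}
```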
Problems (1) and (2) can also be seen in some problems with operator overloading:
Notably, C#, a language with explicit properties, simply gives errors in cases (1) and (2) when a struct is returned from a property.
- D's current properties are fairly easy to implement.
- Omissible parentheses are a part of D's template syntax: i.e. myTemplate!int vs. myTemplate!(int)
- D's current properties allow a concept that has been referred to as function-property duality. This duality allows function call chaining to become cleaner in some cases:
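A small sketch of the chaining point, assuming the Phobos functions `std.string.strip` and `std.uni.toUpper` called with member-style syntax. The wrapper names `shoutVerbose` and `shoutTerse` are hypothetical.

```d
import std.string : strip;
import std.uni : toUpper;

// Every call written with explicit parentheses.
string shoutVerbose(string s)
{
    return s.strip().toUpper();
}

// The same chain with the parentheses omitted.
string shoutTerse(string s)
{
    return s.strip.toUpper;
}

unittest
{
    assert(shoutVerbose("  hello ") == "HELLO");
    assert(shoutTerse("  hello ") == "HELLO");
}
```

Both functions do the same thing. Whether the second form reads as cleaner or as sloppier is exactly the disagreement described elsewhere on this page.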
The resulting code does exactly what most programmers would expect: the array's length is increased by one, allocating more memory as needed.
The idea is to rewrite calls to properties such that if a getter is found within an expression that produces side effects, then the setter is guaranteed to be called. Put more plainly: if there is any chance the property should be altered by the expression, then the setter will definitely be called. If there is no chance the property should be altered by the expression, then the setter will not be called. The rewrite only applies to expressions with side effects:
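The rewrite can be sketched by hand. The function below shows what an expression such as `buf.length++;` would be turned into under this proposal. It is an illustration of the idea, not compiler output; `Buffer`, `setterCalls`, and `incrementLength` are hypothetical names.

```d
struct Buffer
{
    private size_t _length;
    int setterCalls; // instrumentation for the example only

    size_t length() const { return _length; }

    void length(size_t n)
    {
        _length = n;
        setterCalls++;
    }
}

// Hand-written equivalent of the proposed rewrite of "buf.length++;".
void incrementLength(ref Buffer buf)
{
    auto tmp = buf.length; // 1. getter reads the current value
    tmp++;                 // 2. side effect is applied to a temporary
    buf.length = tmp;      // 3. setter is guaranteed to run
}

unittest
{
    Buffer buf;
    incrementLength(buf);
    assert(buf.length == 1);
    assert(buf.setterCalls == 1);

    // A pure read has no side effect, so no setter call is generated.
    auto n = buf.length;
    assert(n == 1 && buf.setterCalls == 1);
}
```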
This change would break backwards compatibility. That's because when a getter returns lvalues like objects or ref returns (Note: ref return properties are currently impossible due to a bug), it will currently be accepted by the compiler and the returned value will be modified by the enclosing expression, but the setter will not be called. Such code is reasonable and useful. Example:
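The original example is not preserved here. A sketch of the kind of code that would change meaning, assuming a getter that returns a class reference, follows. `Style`, `Panel`, and `setterCalls` are hypothetical names.

```d
class Style
{
    int margin;
}

struct Panel
{
    private Style _style;
    int setterCalls; // instrumentation for the example only

    Style style() { return _style; } // returns a reference to the object

    void style(Style s)
    {
        _style = s;
        setterCalls++;
    }
}

unittest
{
    Panel p;
    p.style = new Style; // one setter call
    assert(p.setterCalls == 1);

    // Today this writes straight through the returned reference and
    // never calls the setter. Under the rewrite proposal, the setter
    // would be called as well.
    p.style.margin = 8;
    assert(p.style.margin == 8);
    assert(p.setterCalls == 1);
}
```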
Another solvable issue worth mentioning is when there are side effects to the left of the property in the expression:
Examples of expression rewriting:
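The page's own examples are not preserved here. As a stand-in, the sketch below writes out by hand one plausible rewrite for a nested case with a side effect to the left of the property: `world.pick(0).pos.x += 5;`. The exact form a compiler would generate is an assumption, and `Vec`, `Actor`, `World`, `pick`, and `nudge` are hypothetical names.

```d
struct Vec { int x; }

struct Actor
{
    private Vec _pos;
    int setterCalls; // instrumentation for the example only

    Vec pos() const { return _pos; }

    void pos(Vec v)
    {
        _pos = v;
        setterCalls++;
    }
}

struct World
{
    Actor[2] actors;
    int lookups;

    // Has a side effect: every lookup is counted.
    Actor* pick(size_t i)
    {
        lookups++;
        return &actors[i];
    }
}

// Hand-written equivalent of the assumed rewrite of
//     world.pick(0).pos.x += 5;
void nudge(ref World world)
{
    Actor* target = world.pick(0); // left-hand side effect runs once
    Vec tmp = target.pos;          // getter
    tmp.x += 5;                    // mutate the temporary
    target.pos = tmp;              // setter writes the result back
}

unittest
{
    World w;
    nudge(w);
    assert(w.actors[0].pos.x == 5);
    assert(w.actors[0].setterCalls == 1);
    assert(w.lookups == 1);
}
```

Caching `target` is what keeps the left-hand side effect from being duplicated when the getter and setter are both called.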
- Solves (1) and (2). By making the rewrite recognize not just properties but also opIndex, this can solve some issues with opIndex and opIndexAssign overloading as well.
- Requires no new syntax. If explicit syntax IS used, then this is still required to solve (1) and (2) in a complete manner.
- Breaks backwards compatibility, but only in cases that become errors or were likely bugs to begin with.
One of the ways to solve some of the problems with D's current property implementation is to create a new property implementation with explicit syntax. This syntax should unambiguously tell the compiler that the declarations in question are properties, not functions. The old implementation of properties (omissible parentheses) may or may not be removed. Below are some advantages and disclaimers regarding property syntax.
Property syntax advantages in general:
- It becomes easy to flag "a.b = 3;" as an error when 'a' is some property that returns a value type such as a struct or int. Thus (2) is solved, but only partially.
- It allows IDEs to determine what variables are properties as opposed to functions or fields. (3)
- It is clear that the getters and setters should not be called with parentheses. This helps with the noun and verb convention as well as ensuring that the properties can be migrated back to fields later. This solves problem (4).
- The delegate ambiguity is resolved. (5)
- Addition of this feature does not break the backwards compatibility of D. The removal of omissible parentheses does. Note that explicit properties and omissible parentheses may coexist peacefully.
- It is possible to give properties optional backing storage when they are declared. This helps DRY in a common case where properties access the state of some similarly named variable:
What property syntax doesn't solve:
- "array.length()++;". There are some suggestions that can solve this, but usually at the expense of exhaustively defining operator overloads. Also, defining a proxy struct that defines opPostInc() and is returned from array.length() could potentially allow this to work, but again with the downside of exhaustively defining overloads. (1) is not solved by syntax alone.
- "a.b = 3;": Making struct value returns work as expected. If a is a property that returns some struct, then "a.b = 3;" can be safely assumed to be an error without limiting the expressive power of functions returning rvalues. However, to make this call a's setter, some manner of property call rewriting is needed.
Properties vs Properties:
Explicit properties and omissible parentheses type properties are not exclusive to each other. Something may be defined as a function or a property, but not both at the same time. This means it is never ambiguous as to whether a function or a property is being called. Indeed, it may be useful to have them both be included in D.
Numerous suggestions have been proposed for property syntax. They tend to fall into one of two categories: defining properties as an attribute of a function, or defining properties as their own declaration complete with scope and members. Some known suggestions are given below.
- Given that these properties have their own scope, they have the potential to intercept and override operators on the thing they are accessing. Such properties can also be populated with fields, functions, and other properties.
- Type consolidation. In D's current property syntax, it is necessary to repeat the property's type: once for the getter's return, once for the setter's return, and another time for the setter's argument. With a property declaration, it is possible to define the type just once in the property's header and never again in the getter or setter.
- It is remotely possible, with these types of syntax, to solve problems (1) and (2) without doing expression rewriting. This would be accomplished with operator overloads in the property's scope. Unfortunately, this has a couple downsides: it requires an enormous amount of boiler-plate code and would leave the opIndex/opIndexAssign lvalue-ness problem unresolved.
- This tends to introduce two levels of nesting for executable parts of each property. Property-as-attribute solutions tend to do this with just one.
The main disadvantage to this one when compared to the other declaration type solutions is that it requires two keywords: get and set. The last variation did not, however, have this disadvantage. Also, get and set are context-dependent, so it is possible for the compiler to allow their usage elsewhere. Still, syntax highlighting a context-dependent keyword can sometimes fail, and other 3rd party tools might have a problem with it.
A keyword (such as "property" in the below example) is used to distinguish a property declaration.
- Given a property attribute solution, a property-declaration solution can possibly be produced using templates and nested structs/classes. The result may not look pretty though:
The idea is to rewrite "foo = 42;" into "opSet_foo(42);".
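Since this rewrite is only a proposal, it can be illustrated only by writing the target of the rewrite out by hand. In the sketch below, `opSet_foo` follows the naming in the text, while `opGet_foo` and the `Settings` struct are assumptions made for the example.

```d
struct Settings
{
    private int _foo;

    // Assumed getter counterpart: would serve reads of "foo".
    int opGet_foo() const { return _foo; }

    // Would serve "foo = value".
    void opSet_foo(int value) { _foo = value; }
}

unittest
{
    Settings s;
    s.opSet_foo(42);             // what "s.foo = 42;" would become
    assert(s.opGet_foo() == 42); // what a read of "s.foo" would become
}
```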
This syntax has a difficult technical limitation: when another field in the same scope as the property is defined with the same name, then it is ambiguous as to which one is called:
This is very similar to the suggestion above, and inherits the same ambiguity difficulties.
This is notably different from the previous suggestion because it makes the first character of the property's identifier case-insensitive. This is mostly to allow flexibility in naming convention. It has been criticized as inconsistent with the rest of the language, since there are no other instances of case-insensitivity in D.
The only problem is when a declaration but not definition is desired:
This syntax was mostly received poorly when it appeared on the newsgroup.
These are solutions where a keyword is used to distinguish functions as being strictly properties. The keyword can be a new keyword, a recycled keyword, or something keyword-like.
These solutions have the advantage that they can be applied as blocks and thus reduce the duplication of property annotation, which is good for DRY.
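The block form can be sketched with the annotation spelling `@property`, which later D compilers accept. That spelling is an assumption relative to the keyword variants discussed here, and `Circle` and its members are hypothetical names.

```d
struct Circle
{
    private double _radius = 1;

    // One annotation covers every declaration in the block.
    @property
    {
        double radius() const { return _radius; }
        void radius(double r) { _radius = r; }
        double diameter() const { return 2 * _radius; }
    }
}

unittest
{
    Circle c;
    c.radius = 2;
    assert(c.radius == 2);
    assert(c.diameter == 4);
}
```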
The return value of getters has "out" attached. The return value of setters has "in" attached.
It has been noted that this syntax can be confusing when placed in close proximity to the other usages of "in" and "out":
This also doesn't work as well as a block attribute, since it only allows getters to be grouped with getters and setters to be grouped with setters.
On the upside, it requires no new keywords.
This solution depends on the implementation of another feature: annotations.
It carries the advantage that property does not have to be a keyword, and can thus be used as an identifier elsewhere in code. Other keywords can also be phased out using the annotations feature.
These tables describe which migrations of declarations will work. The first describes the migration possibilities in D as it currently stands. The second describes the migration possibilities after expression rewriting and explicit property syntax have been introduced.
- NA Fields are Non-Addressable fields. The address-of operator (&field) cannot be used on them. All fields in SafeD are of this type.
- PropFuncs are property-functions. All functions currently in D with 0 or 1 arguments are property getters and setters respectively.
- Functions in the tables below are functions in the strict sense. A call to one of these MUST be done with parentheses. Any function in D currently written with 0 or 1 arguments will not fall into this category, as there is no way to enforce the use of parentheses. D functions with 2 or more arguments fall into this category.
* Fields can be migrated to non-addressable fields in most cases except for, of course, addressing. It is possible for user code to create a pointer to one, but impossible to create a pointer to the other.
** The field types can be migrated to property-functions, but only sometimes. If the field starts as a value type (such as an int, float, or struct) then it cannot be migrated to a property-function without breaking existing user code. Worse yet, the kind of broken code that results may compile fine but require hours of debugging.
The terminology in this second table is the same as the first. Notably, properties have been added, since explicit properties are assumed to exist in D for this table.
* Fields can be migrated to non-addressable fields and property-alikes, with the exception of when their address is taken.
** Properties can not always be migrated to fields because they may be in a class and the user might inherit from the class and override the property. They can be migrated back to fields when they are a member of a struct, a final class, or are marked final.
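The overriding restriction can be sketched as follows, with hypothetical class names. Once a subclass in user code overrides the property, the base class author can no longer turn it into a field.

```d
class Shape
{
    // Written as a property-style accessor.
    int sides() { return 0; }
}

// Imagine this class lives in third-party code.
class Triangle : Shape
{
    override int sides() { return 3; }
}

unittest
{
    Shape s = new Triangle;
    assert(s.sides == 3); // dispatches to the override

    // If Shape.sides were changed to "int sides;", the override in
    // Triangle would stop compiling, because fields cannot be overridden.
}
```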
- Going from the first table to the second, property-functions gained a promotion in what could be migrated to them. There are no longer corner-cases where value types (such as ints, floats, and structs) are incapable of being migrated to a property or property-function.
- Property-functions are a one-way road. Fields and properties can be migrated to them, but it is not possible to get back or migrate from property-functions to anything else. This is because property-functions may or may not be invoked with parentheses, a characteristic that is lacking from every other type of member.
- Property-functions and functions are exclusive to each other when it comes to the 0 or 1 argument varieties. Either omissible parentheses are kept, or everything that was once a property-function becomes a strict function. This inevitability can be avoided by adding an annotation for functions that makes the distinction explicit, though such a thing would have little or no benefit.
- Functions cannot be migrated to or from anything without breaking user code. Anything that starts life off as a function is quite frozen that way.
It is possible to have omissible parentheses, lvalue expression rewriting, and explicit properties implemented all at the same time. This solves all of the problems while retaining the advantages of omissible parentheses.
Advantages: Everything mentioned directly above.
- Of all the solutions to the current property problems, this is the most difficult to implement. This is because it entails the difficulty of implementing property expression rewriting plus the difficulty of implementing explicit property syntax, all while asserting that D compilers must support omissible parentheses. Then again, omissible parentheses aren't that hard to support, so this might not be more difficult than any other solution that solves all of the problems.
- Some programmers might still complain that omitting parentheses from function calls or writing things like the below example is sloppy programming, and that they don't want to ever deal with such code written by other D programmers.
- It is possible to implement a property two different ways. This may be confusing to new D users.
- The example "writefln = 42" has not compiled for a while. It should be replaced by a more general example, such as "foo = 42", "func = 42", "free_func = 42" or "global_func = 42". (Got it. --ChadJoan) Or change it to writefln = "hi" which still compiles, I believe.
- Omissible parentheses are also a part of D's template syntax: i.e. myTemplate!int vs. myTemplate!(int)
- The noun-verb section should include a section on the use of context/usage to disambiguate meaning. See http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=94924 for the relevant forum thread.
- "the properties can be migrated back to fields later". This is false, as a property cannot be migrated back if a sub-class (which may exist in 3rd party code) has overridden the property. (Check. --ChadJoan)
- Problem (3) should mention 'ditto' D docs as a mitigating feature.
- Problem (4) should contain all three calling options: (with or without () or either-or). During the forum discussions there were some very clear examples of either-or being correct, (std.string.split IIRC) (Done. --ChadJoan)
- Problem (5) The delegate ambiguity contains a bad example of the problem. The actual problem isn't a zero-arg delegate returned from a zero-arg function. The issue is that a.b(5) will never be re-written a.b()(5) currently. This is actually a variant on (1) and (2). The current (5) should be merged with (1) or (2), and then limited to the double zero-arg corner case. There's also a solution to the ambiguity when limited to semantic rewriting discussed briefly here: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=94932
- BTW ref return properties don't work right yet either.
I'm not sure I agree about merging the zero-arg-void-return case into (1) and (2), since it cannot be solved by semantic rewriting and is completely solved with explicit syntax, which is entirely unlike (1) and (2). Also it has little or nothing to do with lvalueness, and everything to do with syntactic ambiguities. If I'm missing something here let me know. I'm more inclined to expand (5) such that it mentions the ambiguity in the general case and has links to the dmd bugs. Please respond with more information. Otherwise, I'll expand (5) when I get some more time.
Poster above me, thank you for the high quality critique.
A clarification: I think the zero-arg-void-return case deserves its own point, separate from the more general case (which may or may not be joined with points 1 and/or 2), as the ZAVR case, at best, results in a compile-time error without an explicit property syntax.
All classes also have a .classinfo property. Information can be found under Phobos:object.
"Properties are member functions that can by syntactically treated as if they were fields."
Q: By the example it seems as though all member functions of a struct/class can be treated as properties. Is this the case? I don't see any specific qualification that determines if a member function is a property or not.
A: I think the only restriction is that the member function has to take zero or one argument. If it takes zero arguments you can always use it like a property rvalue; if it takes one argument then you can always use it like a property lvalue.
Note that this property syntax extends to ordinary functions as well:
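The example that followed this note is not preserved here. A minimal sketch with hypothetical free functions, relying on omissible parentheses rather than any explicit property syntax:

```d
int lastLogged; // hypothetical module-level state

void logValue(int v) { lastLogged = v; }

int answer() { return 42; }

unittest
{
    logValue = 7; // same as logValue(7)
    assert(lastLogged == 7);

    int a = answer; // same as answer()
    assert(a == 42);
}
```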
A: I think there's a reason for this. A common mistake is to try to do "s.length++;". I know what I want the compiler to do, but D's designer is afraid that someone will try something like "s[s.length ++] = foo;" which would have an undefined result. -- JustinCalvarese
A: object.count(object.count() + 1) shows the reason for prohibiting properties from being lvalues: the expression has a side effect. In some cases you would not want that side effect, but you would have no way of noticing it if the compiler silently accepted the code.
Corresponding page in the D Specification